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Bright  spot  is  one  of  the  pattern  classes  in  a  seismic  section  and  the  indicators  of 
gas  (hydrocarbon)  accumulation.  In  the  past,  detection  of  bright  spots  depended  primar 
ily  upon  visual  examination  and  the  experience  of  a  geophysicist.  It  is  the  authors 
contention  that  bright  spot  detection  could  be  made  more  confidently  by  computer-aided 
analysis.  This  study  concerns  with  two  computer-aided  methods.  One  is  the  decision- 
theoretic  pattern  recognition,  the  other  is  the  syntactic  or  structural  pattern 
recognition.  Using  these  two  methods,  a  seismogram  is  classified  into  two  classes,  i.e 
bright  spot  and  non-bright  spot. 

In  the  decision-theoretic  pattern  recognition,  three  features  from  seismic  traces 
extracted,  envelope,  instantaneous  frequency,  and  polarity.  Hilbert  transform  theorem 
plays  an  important  role  in  the  analytic  signal  analysis.  Linear  and  tree  classification 
techniques  are  applied.  The  classification  result  provides  candidate  bright  spots. 

The  second  approach  to  detect  candidate  bright  spots  is  the  utilization  of  the 
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syntactic  pattern  recognition  technique.  Tree  classification  is  used  to  extract  the 
pattern  wavelets  of  bright  spot.  Structural  informtion  of  the  pattern  wavelets  of  a 
bright  spot  is  used,  as  are  Levenshtein  distance  computation  and  nearest— neighbor 
decision  rule.  A  threshold  is  determined  from  error  probability  calculation  and  is  use! 
to  detect  candidate  bright  spots. 

Another  factor  affecting  the  detection  of  bright  spot  is  frequency  attenuation.  A 
"partitioning-method"  is  presented.  A  seismogram  is  partitioned  into  small  sections 
and  the  tree  classification  is  performed  in  each  section  to  detect  the  candidate  bright 
spot. 

After  the  candidate  bright  spots  are  determined,  syntactic  pattern  recognition 
technique  is  used  agian  to  recognize  the  string  representation  of  a  bright  spot  in  a 
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two  dimensional  seismogram.  The  final  result  will  indicate  a  bright  spot  or  non-bright 

spot.  . 

It  is  common  in  seismic  signal  analysis  to  use  the  zero-phase  Ricker  wavelets.  This 
study  also  utilizes  these  patterns  in  the  simulation  to  test  the  proposed  techniques  as 
they  are  applied  to  the  relative-amplitude  real  seismogram.  The  classification  results 
obtained  from  these  computer-aided  methods  can  be  used  to  improve  seismic  interpretation 
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ABSTRACT 


Bright  spot  is  one  of  the  pattern  classes  in  a  seismic  section  and 
the  indicators  of  gas  (hydrocarbon)  accumulation.  In  the  past,  detec¬ 
tion  of  bright  spots  depended  primarily  upon  visual  examination  and 
the  experience  of  a  geophysicist.  It  is  the  authors'  contention  that 
bright  spot  detection  could  be  made  more  confidently  by  computer- 
aided  analysis.  This  study  concerns  with  two  computer-aided  methods. 
One  is  the  decision-theoretic  pattern  recognition,  the  other  is  the  syn¬ 
tactic  or  structural  pattern  recognition.  Using  these  two  methods,  a 
seismogram  is  classified  into  two  classes,  i.e.,  bright  spot  and  non- 
bright  spot. 

In  the  decision-theoretic  pattern  recognition,  three  features  from 
seismic  traces  extracted,  envelope,  instantaneous  frequency,  and  polar¬ 
ity.  Hilbert  transform  theorem  plays  an  important  role  in  the  analytic 
signal  analysis.  Linear  and  tree  classification  techniques  are  applied. 
The  classification  result  provides  candidate  bright  spots. 

The  second  approach  to  detect  candidate  bright  spots  is  the  utiliza¬ 
tion  of  the  syntactic  pattern  recognition  technique.  Tree  classification 
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is  used  to  extract  the  pattern  wavelets  of  bright  spot.  Structural  infor¬ 
mation  of  the  pattern  wavelets  of  a  bright  spot  is  used,  as  are  Levensh- 
tein  distance  computation  and  nearest-neighbor  decision  rule.  A  thres¬ 
hold  is  determined  from  error  probability  calculation  and  is  used  to 
detect  candidate  bright  spots. 

Another  factor  affecting  the  detection  of  bright  spot  is  frequency 
attenuation.  A  partitioning-method”  is  presented.  A  seismogram  is 
partitioned  into  small  sections  and  the  tree  classification  is  performed 
in  each  section  to  detect  the  candidate  bright  spot. 

After  the  candidate  bright  spots  are  determined,  syntactic  pattern 
recognition  technique  is  used  again  to  recognize  the  string  representa¬ 
tion  of  a  bright  spot  in  a  two  dimensional  seismogram.  The  final  result 
will  indicate  a  bright  spot  or  non-bright  spot. 

It  is  common  in  seismic  signal  analysis  to  use  the  zero-phase 
Ricker  wavelets.  This  study  also  utilizes  these  patterns  in  the  simula¬ 
tion  to  test  the  proposed  techniques  as  they  are  applied  to  the 
relative-amplitude  real  seismogram.  The  classification  results  obtained 
from  these  computer-aided  methods  can  be  used  to  improve  seismic 
interpretation. 
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CHAPTER  I 

INTRODUCTION 


1.1  Statement  of  the  Problem 

By  1972  many  oil  companies  had  become  successful  in  predicting 
the  occurrence  of  offshore  gas  from  exploration  seismic  reflection  data. 
These  predictions  were  based  on  anomalies  that  would  be  expected  in 
the  high  amplitudes  of  reflections  caused  by  differences  between  the 
reflectivity  of  surfaces  bounding  sands  containing  gas  and  those  bound¬ 
ing  water  or  oil  bearing  portions  of  the  sands.  The  amplitude  of  a 
seismic  wave  reflected  from  an  interface  between  two  layers  of  materi¬ 
als  is  governed  by  the  reflection  coefficient  R  which  is  expressed  for 
normal  incidence  by  the  relation 

R  =  E^r]hXl 

DzVz+D.V, 

where  D j  and  D %  are  the  respective  densities  on  the  near  (incident)  and 
far  sides  of  the  boundary  and  V j  and  V2  are  the  respective  velocities  for 
the  two  sides.  The  product  of  D  and  V  is  known  as  the  acoustic 
impedance,  and  it  is  evident  that  the  reflection  coefficient  and  hence 
the  reflection  amplitude  depend  on  the  change  in  acoustic  impedance 
acioss  the  reflecting  interface.  A  high-amplitude  portion  of  the 
reflection  is  referred  to  as  a  bright  spot  [dob76a]. 
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Some  very  important  indicators  in  predicting  the  occurrence  of  gas 
and  oil  from  real  seismic  reflection  data  are  as  follows  [dob76a,  pan70a, 
pay77a,  she74a,  she75a,  she76a]. 

(1)  High  amplitude  is  related  to  the  high  reflection  coefficient  at 
the  top  boundary  of  the  gas  sand. 

(2)  Low  frequency  wavelets  are  at  the  reflection  of  the  top  boun¬ 
dary  of  gas  sand  zone  because  of  the  high  frequency  attenuation. 

(3)  Phase  reversals  are  produced  by  a  negative  reflection 
coefficient  at  the  top  boundary  of  gas  sand  and  again  by  the  positive 
reflection  coefficient  at  the  bottom  boundary  of  gas  sand. 

Three  examples  of  relative -amplitude  real  seismograms  of  bright 
spot  from  Dobrin  are  shown  in  Fig.  1.1,  1.2,  and  1.3  [dob76a].  The  shape 
of  the  bright  spot  is  a  limited  continuous  reflection  layer  and  may  be 
horizontal,  arched,  or  concave.  The  problem  is  that  the  interpretation 
is  strictly  by  visual  analysis.  A  seismogram  is  interpreted  by  an  inter¬ 
preter  from  his  experience,  geological  information  from  both  ends  of 
the  seismogram,  well  information,  etc.  Interpretation  of  seismic  sec¬ 
tions  is  a  tedious  and  subjective  task  which  challenges  the  exploration 
geophysicists.  Furthermore,  subtle  changes  in  the  nature  of  the 
reflected  signal  often  cannot  be  seen  by  visual  analysis.  Successful 
application  of  automatic  information  extraction  techniques  to  seismic 
exploration  data  would  greatly  relieve  the  burden  of  visual  inspection  of 
large  quantities  of  seismic  section  plots. 

Physical  properties  of  bright  spot  are  used  in  this  study.  The 
bright  spot  in  this  study  is  defined  in  the  following. 

Bright  spot  has  the  following  three  kinds  of  physical  properties  and  a 
continuous  reflection  layer. 


Time 
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Receiving  station 


(Sec.) 


Figure  1.1  Record  section  processed  to  bring  out  relative  amplitudes. 
Bright  spot  at  center  shows  known  gas  accumulation.  (Western  Geophy¬ 
sical  Co.  of  America.)  (After  Dobrin,  1976) 
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Receiving  station 


Figure  1.2  Horizontal  reflections  (at  1.2  and  2.0  s)  indicative  of  gas-oil 
water  contacts.  Note  arched  reflections  above  each  flat  event,  presum¬ 
ably  from  top  of  gas  sand.  (Teledyne  Exploration  Co.)  (After  Dobrin, 
1976) 


Time 


5 


Receiving  station 


(Sec.) 


Figure  1.3  Depression  of  deep  reflectors  under  multiple  gas  zones 
caused  by  reduction  of  velocity  in  gas-bearing  sands.  Note  attenuation 
of  reflections  below  well  at  times  greater  than  1.1  s.  (Continental  Oil 
Co.)  (After  Dobrin,  1976) 
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(1)  The  first  kind  consists  of  the  properties  of  high  amplitude,  low  fre¬ 
quency,  and  negative  polarity. 


(2)  The  second  kind  consists  of  the  properties  of  high  amplitude  and  low 
frequency. 


(3)  The  third  kind  consists  of  the  properties  of  high  amplitude  and 
negative  polarity. 

Data  satisfying  at  least  one  of  these  three  kinds  of  properties  is  called  a 
candidate  bright  spot.  Data  satisfying  the  properties  of  continuous 
reflection  layer  and  one  of  the  three  kinds  is  called  a  bright  spot.  The 
second  and  third  kinds  are  necessary  due  to  the  fact  that  low  frequency 
content  or  polarity  may  not  be  significant  in  the  seismogram. 

The  absolute  magnitudes  of  the  three  physical  properties  may  not 
be  as  important  as  their  relative  values  because  the  reflection  from 
geological  structure  is  case  by  case  or  area  by  area.  The  dominant 
wavelets  may  be  zero-phase  as  zero-phase  Ricker  wavelet  [ric40a,  45a, 
53a]  or  minimum  phase  [rob67a].  Comparing  the  dominant  wavelets  in 
the  seismogram,  the  polarity  can  be  determined. 

Bright  spot  is  one  of  the  pattern  classes  in  a  seismic  section  and 
the  indicators  of  gas  (hydrocarbon)  accumulation.  In  order  to  detect 
bright  spot  more  confidently  by  computer-aided  analysis,  two  methods 
are  proposed.  One  is  the  decision-theoretic  pattern  recognition  and  the 
other  is  the  syntactic  or  structural  pattern  recognition.  Because  the 
physical  properties  of  a  bright  spot  are  relative,  computer-geophysicist 
interaction  is  needed  in  this  study.  The  seismogram  is  classified  into 
two  classes,  i.e.,  bright  spot  and  non-bright  spot. 
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In  the  decision-theoretic  pattern  recognition,  some  testing  traces 
are  selected.  The  testing  traces  may  be  randomly  selected  or  selected 
at  the  high-amplitude  portion  in  the  seismogram.  The  purpose  of  the 
testing  traces  is  for  the  data  reduction  and  to  detect  whether  or  not 
the  candidate  bright  spot  is  in  the  seismogram.  Three  features  are 
extracted:  envelope  corresponding  to  the  amplitude,  instantaneous  fre¬ 
quency  corresponding  to  the  frequency  content,  and  polarity  testing 
the  phase  reversal.  Linear  and  tree  classification  techniques  are 
applied.  The  classification  is  point-by-point.  The  classification  result  is 
the  candidate  bright  spot. 

The  other  approach  to  detect  a  candidate  bright  spot  is  using  the 
syntactic  pattern  recognition  technique.  Testing  traces  are  selected 
from  the  seismogram.  Tree  classification  is  applied  to  detect  the  pat¬ 
tern  wavelets  of  bright  spot  in  the  testing  traces.  Structural  informa¬ 
tion  of  the  bright  spot  wavelets  is  used.  Levenshtein  distance  computa¬ 
tion,  and  the  nearest-neighbor  decision  rule  are  used.  A  threshold  is 
determined  from  error  probability  calculation  and  is  used  to  detect  the 
candidate  bright  spot.  The  classification  is  trace-by-trace. 

Reflection  of  frequency  attenuation  affects  the  detection  of  bright 
spot.  A  "partitioning-method"  is  presented.  A  given  seismogram  is  par¬ 
titioned  into  small  sections.  Tree  classification  is  performed  in  each 
section  to  detect  candidate  bright  spots.  The  major  advantage  of  the 
partitioning-method  is  that  the  overlapping  distribution  of  the  envelope 
and  instantaneous  frequency  can  be  separated. 

After  the  candidate  bright  spots  are  determined,  syntactic  pattern 
recognition  technique  is  used  again  to  recognize  the  string  of  bright 
spot  in  a  two  dimensional  seismogram.  Three  kinds  of  string  distance 
computation  are  proposed  to  test  the  continuity  of  bright  spot  pattern. 
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The  final  result  will  indicate  a  bright  spot  or  non-bright  spot. 

Zero-phase  Ricker  wavelets  are  usually  used  in  the  simulation  of 
the  seismic  analysis  [robQla,  tan79a].  From  the  distribution  of 
envelope  and  instantaneous  frequency  of  Ricker  wavelets,  tree 
classification  is  adopted.  At  first,  obvious  Ricker  wavelets  are  classified. 
Then,  a  simulated  seismogram  using  Ricker  wavelets  is  classified.  At 
last,  real  seismograms  at  Mississippi  Canyon  and  High  Island  are 
detected  to  determine  whether  or  not  there  is  bright  spot  and  where 
the  bright  spot  is  located.  The  relative-amplitude  real  seismograms  in 
this  study  are  provided  by  Mr.  K.  M.  Barry,  Vice  President  of  the 
Teledyne  Exploration  Co. 


1.2  Literature  Survey 


1.2.1  Pattern  Recognition  in  Classification 
of  Seismic  Signals 


Most  works  in  the  analysis  of  seismic  signals  deal  with  discrimina¬ 
tion  between  earthquake  and  nuclear  explosion  events.  Chen  [che77a, 
78a,  82a,  83a]  proposed  a  statistical  pattern  recognition  method  for 
classification  of  earthquake  and  nuclear  explosion  waves.  He  cites  a 
number  of  characteristics  of  seismic  signals  which  are  used  as  discrim¬ 
inants.  Among  these  are  spectral  ratios  for  body  waves  and  surface 
waves,  mb  vs.  Ms  criterion  and  differences  in  P-wave  amplitude  spectra. 
Liu  and  Fu  [hu82a,  82b,  83a]  used  syntactic  approach  to  discriminant 
earthquake  and  nuclear  explosion  waves.  Anderson  [and78a,  82a]  and 
Gaby  [gab83a]  used  syntactic  analysis  in  seismological  waveforms. 
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Tjostheim  [tjo75a,  77a,  78a,  79a]  used  autoregressive  coefficients  as 
features  for  P-wave  seismic  discrimination.  A  seismic  P-wave  can  be 
represented  by  an  autoregressive  model  of  finite  order. 

Sarna  and  Stark  [sar80a,  82a]  also  used  autoregressive  modeling 
for  the  pattern  recognition  of  earthquake  and  explosion  data.  The 
results  are  poor. 

Since  1980,  several  papers  related  to  pattern  recognition  in  seismic 
exploration  have  been  published.  Bois  [boi80a,  81a,  81b,  82a]  used 
autoregressive  coefficients  for  short  trace  sectors  between  the  top  and 
the  bottom  boundaries  of  a  reservoir  and  by  this  he  was  able  to  estab¬ 
lish  a  decision  criterion  to  distinguish  the  trace  sector  corresponding 
to  layers  containing  oil  or  gas  and  those  containing  water  only. 

Hagen  [hag81a]  used  principal  components  of  instantaneous  fre¬ 
quency  for  the  classification  of  regions  of  a  seismic  section  into  porous 
and  non-porous. 

Huang  et.  al.  [hua81b]  decomposed  a  simulated  seismogram  into 
density  and  velocity  distributions  to  detect  bright  spots.  Huang  et.  al. 
[huaSlc]  used  Z ,T ,  likelihood  ratio  and  Chi-squared  tests  and  spatial 
relations  to  obtain  continuous  layer  reflection  coefficients. 

Huang  and  Fu  [hua82a]  used  envelope  and  instantaneous  frequency 
as  features  and  decision-theoretic  pattern  recognition  techniques  for 
the  classification  of  Ricker  wavelets  and  the  detection  of  candidate 
bright  spots.  The  classification  results  are  displayed  in  the  original 
time  traces.  This  presentation  was  very  important  because  it  was  very 
successful  in  real  data  experiment  [hua83a].  Huang  and  Fu  also  pro¬ 
posed  three  hypotheses  and  tree  classification  techniques  to  detect 
candidate  bright  spots  [hua83a,  hua83e].  Huang  and  Fu  presented  a 
partitioning-method  and  tree  classification  in  the  detection  of 
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candidate  bright  spots  for  frequency  attenuated  seismograms  [hua83c]. 

Syntactic  pattern  recognition  had  been  applied  to  character  recog¬ 
nition,  target  detection,  medical  diagnosis,  remote  sensing,  speech 
recognition,  automatic  inspection,  and  identification  of  human  faces 
and  fingerprints  in  the  past  decade  [fu74a,  82a],  Huang  and  Fu 
presented  a  syntactic  pattern  recognition  technique  for  the 
classification  of  Ricker  wavelets  [hua83d]. 


1.2.2Analytic  Signal  Analysis 

Analytic  signal  analysis  has  been  applied  to  seismological  waves  and 
exploration  seismic  signal.  Farnbach  [far75a]  used  analytic  signal 
representation  for  seismological  waves.  A  recent  paper  by  Tanner, 
Koehler,  and  Sheriff  [tan77a,  79a,  80a]  used  complex  trace  (analytic  sig¬ 
nal)  in  exploration  seismic  signal.  Sicking  used  complex  trace  in 
modeling  [sic78a].  Robertson  and  Nogami  used  complex  trace  to  thin 
bed  stratigraphy  [rob81a].  Huang  et.  al.  used  analytic  signal  represen¬ 
tation  for  synthetic  seismogram  of  bright  spots  [hua80a,  81a],  Huang 
and  Fu  presented  analytic  signal  analysis  in  Gaussian  noise  analysis  and 
their  use  in  the  classification  of  Ricker  wavelets  [hua83b]. 


1.3  Organization  of  Thesis 

In  Chapter  2,  features  are  extracted  by  using  analytic  signal 
analysis.  Analytic  signal  analysis  in  AM,  FM,  sinusoidal  signal,  seismic 
Ricker  wavelets,  and  Gaussian  noise,  is  discussed.  Hilbert  transform 
theorem  plays  an  important  role  in  the  analytic  signal  analysis.  The 
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usage  of  the  non-causal  Hilbert  operator  is  also  introduced. 

In  Chapter  3,  the  motivation  of  using  tree  classification  in  the 
detection  of  candidate  bright  spot  is  discussed.  Use  of  linear  classifier 
is  also  discussed.  Three  hypotheses  are  proposed  for  the  constrained 
conditions  to  detect  candidate  bright  spots  similar  to  the  constrained 
condition  in  mathematical  optimization  problem.  Tree  classifier  design 
is  discussed. 

In  Chapter  4,  a  partitioning-method  and  tree  classification  are  pro¬ 
posed  to  detect  candidate  bright  spot  in  order  to  avoid  the  attenuation 
effect.  The  procedures  for  partitioning  a  seismogram  into  small  sec¬ 
tions  are  discussed. 

In  Chapter  5,  syntactic  pattern  recognition  is  used  in  the  detection 
of  candidate  bright  spot.  The  roles  of  likelihood  ratio  test,  optimal 
quantization  encoding,  and  the  probability  of  detection  involving  the 
global,  local  detection  and  threshold  setting  to  detect  the  candidate 
bright  spots,  are  discussed. 

For  Chapter  3,  4,  and  5,  the  results  for.  the  detection  of  candidate 
bright  spots  in  the  simulation  and  real  data  experiments  are  demon¬ 
strated  respectively.  The  classification  results  of  Ricker  wavelets  are 
also  demonstrated. 

In  Chapter  6,  syntactic  pattern  recognition  technique  is  used  again 
to  recognize  the  string  of  bright  spot  in  a  two  dimensional  seismogram. 
The  final  result  will  indicate  a  bright  spot  or  non-bright  spot. 

Finally,  Chapter  7  summaries  the  whole  study  and  proposes  recom¬ 
mendation  for  future  work. 
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1.4  Summary  of  Contributions 

(1)  Several  lemmas  are  derived  for  the  instantaneous  frequency  of 
AM  and  sinusoidal  signals.  The  instantaneous  frequency  of  Ricker 
wavelet  is  inferred  from  the  instantaneous  frequency  of  AM  because  the 
central  part  of  Ricker  wavelet  is  a  cosine  modulated  waveform. 

(2)  Instantaneous  frequency  can  be  used  in  FM  demodulation. 

(3)  Non-causal  Hilbert  operator  is  padded  with  zeros  in  the  middle 
to  avoid  circular  convolution.  The  minimum  number  of  points  of  the 
Hilbert  operator  required  for  a  discrete  signal  with  length  N  is  derived. 

(4)  From  the  analysis  of  zero-phase  Ricker  wavelet,  a  tree 
classification  technique  is  adopted. 

(5)  Some  lemmas  for  the  analytic  signal  analysis  of  Gaussian  noise 
as  the  ground  roll  motion  are  derived  and  provide  the  references  in  the 
tree  classifier  design. 

(6)  Three  hypotheses  are  proposed  for  the  constrained  conditions 
in  the  tree  classification  to  detect  the  candidate  bright  spots  similar  to 
the  constrained  condition  of  mathematical  optimization  problem. 

(7)  The  major  advantage  of  partitioning-method  is  that  the  overlap¬ 
ping  distribution  of  the  envelope  and  instantaneous  frequency  can  be 

separated  from  different  sections  and  attenuation  effects  may  be 
avoided. 

(8)  In  the  syntactic  pattern  recognition  for  the  detection  of  candi¬ 
date  bright  spot,  the  roles  of  likelihood  ratio  test,  optimal  quantization 
encoding,  and  the  probability  of  detection  involving  the  global,  local 
detection  and  threshold  setting  in  the  detection  of  candidate  bright 
spots  are  quite  important. 
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(9)  Syntactic  pattern  recognition  technique  is  used  to  recognize 
the  string  of  bright  spot  in  a  two  dimensional  seismogram.  Three  kinds 
of  string  distance  computation  are  proposed  to  test  the  continuity  of 
bright  spot  pattern. 
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CHAPTER  II 

ANALYTIC  SIGNAL  ANALYSIS:  FEATURE  EXTRACTION 


2. 1  Introduction 

Feature  extraction  is  the  most  important  part  in  pattern  recogni¬ 
tion.  The  physical  meaning  and  separability  power  of  features  are  con¬ 
cerned.  In  order  to  extract  features  from  seismic  signal,  analytic  signal 
analysis  is  applied.  In  this  Chapter,  analytic  signal  analyses  in  AM,  FM, 
sinusoidal  signal,  seismic  Ricker  wavelet,  and  Gaussian  noise  are  dis¬ 
cussed. 

Zero-phase  Ricker  wavelets  are  usually  used  in  the  simulation  of 
seismic  analysis  [rob81a,  tan79a].  The  pattern  wavelet  of  bright  spot  in 
real  data  can  be  compared  with  the  central  part  of  zero-phase  Ricker 
wavelet.  From  Chapter  1,  the  physical  properties  of  bright  spot  is  rela¬ 
tive.  Here,  20 Hz  zero-phase  Ricker  wavelet  is  simulated  as  the 
reflection  wavelet  of  bright  spot.  The  20 Hz  Ricker  wavelet  has  the  phy¬ 
sical  properties  of  high  amplitude,  low  frequency  content,  and  phase 
reversal.  30 Hz  zero-phase  Ricker  wavelet  is  simulated  as  the  reflection 
wavelet  of  non-bright  spot.  At  first,  pattern  recognition  techniques  will 
be  applied  to  the  classification  of  Ricker  wavelets  in  a  simulated  seismic 
trace  and  the  detection  of  candidate  bright  spot  in  the  simulated 
seismogram.  The  one  dimensional  classification  results  can  be  used  for 
two  dimensional  classification  problem.  From  the  classification  result 
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of  simulated  seismogram,  the  detected  portion  should  be  a  continuous 
reflection  layer.  At  last,  pattern  recognition  techniques  will  be  applied 
to  real  seismograms.  These  will  be  described  in  Chapter  3.  In  Chapter 
6,  syntactic  pattern  recognition  will  be  used  to  recognize  the  string  of 
bright  spot.  The  final  result  of  bright  spot  detection  will  be  given. 

From  the  analysis  of  zero-phase  Ricker  wavelet,  a  tree  classification 
technique  is  adopted.  The  analytic  signal  analysis  of  Gaussian  noise  as 
the  ground  roll  motion  provides  the  references  in  the  tree  classifier 
design.  These  will  be  discussed  in  Chapter  3.  The  resolution  analysis  of 
Ricker  wavelet  in  this  Chapter  can  be  used  to  help  the  interpretation  of 
thin-bed  effect  in  Chapter  3. 

Analytic  signal  [gab46a,  fra69a]  has  usually  been  used  in  radar  sig¬ 
nal  analysis,  but  in  recent  years  it  has  become  more  and  more  impor¬ 
tant  in  seismic  signal  processing  [far75a,  tan77a,  tan79a,  hua81a, 
hua82a,  hua83a,  hua83b,  rob82a,  hag81a].  Envelope,  instantaneous 
phase,  and  instantaneous  frequency  are  some  of  the  parameters  in  ana¬ 
lytic  signal  representation.  Envelope  describes  the  shape  of  the  signal. 
Instantaneous  frequency  extracts  the  internal  property  of  the  signal 
and  is  more  important  than  the  other  two.  Hilbert  transform  theorem 
(named  here)  plays  an  important  role  in  the  analytic  signal  analysis.  In 
applications,  the  seismic  zero-phase  Ricker  wavelets  and  sinusoidal  sig¬ 
nals  can  be  considered  as  AM  and  the  instantaneous  frequency  of  these 
signals  can  be  derived.  The  number  of  points  of  Hilbert  operator  that 
should  be  used  are  not  discussed  in  [tan77a,  79a,  80a].  So  the 
minimum  number  of  points  for  the  Hilbert  operator  in  the  discrete 
implementation  will  be  discussed.  Gaussian  bandpass  noise  is  simu¬ 
lated  as  the  ground  roll  motion  in  the  seismic  recording  system.  Pro¬ 
perties  of  the  analytic  signal  in  the  case  of  Gaussian  10  -  60  Hz  white 


16 


noise  will  be  discussed  here.  Different  definitions  of  envelope  function 
and  phase  function  are  derived  in  [tho69a];  the  distributions  are  Ray¬ 
leigh  and  uniform  respectively  in  the  continuous  time  domain.  For¬ 
tunately,  the  distributions  of  envelope  and  instantaneous  phase  using 
analytic  signal  analysis  are  derived  here  as  Rayleigh  and  uniform 
respectively  in  both  discrete  and  continuous  time  domain.  Usually  fre¬ 
quency  discriminator  and  phase  lock  loop  feedback  tracking  techniques 
[zie76a,  gag78a]  are  used  in  FM  demodulation.  But  the  instantaneous 
frequency  technique  presented  here  can  also  treat  FM  demodulation. 
The  classification  of  signal  and  noise  for  the  case  of  high  S/N  using 
envelope  property  is  also  discussed.  Instantaneous  frequency  tech¬ 
nique  treated  in  the  time  domain  can  also  be  used  to  detect  a  hidden 
periodic  signal  when  its  period  is  unknown,  not  using  periodogram 
[blo76a]  or  harmogram  [hin82a].  In  [cra67a],  Cramer  used  the  cosine 
function  as  the  real  part  and  sine  function  as  the  imaginary  part  of  the 
analytic  signal  and  derived  results  for  envelope  and  instantaneous  fre¬ 
quency.  It  is  a  special  case  of  analytic  signal  because  the  imaginary 
part  of  the  analytic  signal  does  not  come  from  the  Hilbert  transform  of 
the  original  signal  and  Hilbert  transform  theorem  is  also  not  used.  For 
non-sinusoidal  signal  like  seismic  Ricker  wavelets  [ric40a,  45a,  53a]  and 
teleseismic  Berlage  function  [far75a,  hua81a],  the  analytic  signal  will  be 
calculated  by  using  Hilbert  transform  [fra69a,  huaSla]. 


2.2  Hilbert  Transfer  Function  and  Hilbert  Transform  Theorem 

The  Hilbert  transfer  pair  [fra69a,  zie76a]  is 

H  sgn  f  <==>  h(t)=  — 

nt 
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Hilbert  transform  (H.T.)  of  sinusoidal  signals  [fra69a,  zie76a]  are  as 
follows: 

H.T.  {cos2tt/  o t  j=sin27r/  0f  H.T.£sin27r/  0t  j  =-cos  2rr/  0t 

An  important  theorem,  Hilbert  transform  theorem  (named  here), 
usually  used  in  the  signal  processing  is  stated  as  follows: 

If  m(t)  and  c(t)  have  bandlimited  spectra  that  are  non-overlapping 
and  if  the  spectrum  of  c(t)  lies  entirely  above  that  of  m(t),  then 

H.T.[m(t)c(t)i=m(t)fH.T.(c(t))i=m(t)dr(t). 

Ziemer  and  Tranter  [zie76a]  have  proved  the  theorem  in  the  time 
domain.  This  theorem  can  also  be  proved  in  the  frequency  domain. 


2.3  Analytic  Signal  and  Its  Representations 

The  analytic  signal  ^(t)  of  a  real  time  function  s(t)  is  a  complex  sig¬ 
nal  [fra69a,  hua81a]: 
f  (t)  =  s(t)  +  j  s  (t) 

where  s(t)  is  the  Hilbert  transform  of  s(t). 

The  analytic  signal  representations  are  defined  as  follows. 

Envelope: 

Env(s(t)|  =  A(t)  =  |^(t)|  =  Vs2(f  )+s2(f) 

Instantaneous  phase: 

T?(t)  =  /_^(t)  =  tan"1 

s(f) 

or  for  f  (t)  =  IV'(t)!  exp(ji?(t)| 

In  f(t)  =  ln|f(t)|  +  j  T?(t) 
then  i?(t)  =  Im[ln  ^(t)] 

Instantaneous  frequency: 
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*w=  ir  ^fL=  f.-1 

dr  2ndt  s(t)  y 


2tt 


s'li(t)+sz(t ) 


or/<(t)~  ars’1®!11’1" «! 


_  jl_ 

27 r 


Imi  ^ln*(t)j  =  U-Im{*j£l) 


27T 


iKO 


2.4  Analytic  Signal  Analysis  in  Sinusoidal  Signals 

Analytic  signal  analysis  is  used  to  extract  features  from  original 
signal.  In  order  to  test  the  power  of  separability  for  the  extracted 
features,  the  use  of  analytic  signal  analysis  in  sinusoidal  signals  is  dis¬ 
cussed.  Suppose  there  are  two  sinusoidal  signals  as  follows. 
s,(0=0.03*cos(27r*20*f)  ,  then  A(0=0.03  ,  fi(t)=20Hz  . 
s2(O=0.02*Cos(27r*30*O  ,  then  A(*)=0.02  ,  fi(t)=30Hz  . 

Envelope  and  instantaneous  frequency  of  the  above  two  sinusoidal  sig¬ 
nals  are  shown  as  above.  In  the  physical  meaning,  envelope  equals  to 
the  maximum  absolute  amplitude  of  the  sinusoidal  signal,  instantane¬ 
ous  frequency  equals  to  the  carrier  frequency  of  the  sinusoidal  signal. 
This  is  in  the  time  domain,  not  in  the  frequency  domain.  The  distribu¬ 
tion  of  envelope  and  instantaneous  frequency  is  shown  in  Fig.  2.1.  From 
Fig.  2.1,  the  power  of  separability  for  the  sinusoidal  signals  is  good. 
Based  on  this  result,  analytic  signal  representations  can  be  used  as  the 
features  in  the  seismic  signal  analysis. 
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Envelope  ( Volts ) 

.0500 


■  0‘fl6 


•  0333 


.0250 


•  0166 


.0083 


0.000  - 

0.000  8.333  16.67  25.00  33.33  *tt  .6?  50.00 

instantaneous  f r equency  {Hz) 
Figure  2. 1  Feature  distribution  of  two  cosine  signals 


Table  2. 1  Hilbert  transform 


t  =  0  t  =  1  t  -  2  t  =  3 


8.333  16.67 


25.00  33.33  *tt  .67  50. 1 

Instantaneous  f 
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2.5  Implementation  in  Discrete  Calculations  and 
Minimum  Number  of  Truncated  Hilbert  Operators 
Required  for  a  Finite  Discrete  Signal  with  Length  N 


From  [gol69a,  opp75a,  rab75a],  a  Hilbert  operator  can  be  written  as 
follows: 


Mn)=^HElMlfornj!0 

71  7T 

=  0  forn  =  0 

s  (n)  is  s(n)  convolving  with  the  Hilbert  operator  h(n).  That  is, 

S(n)  =1-  2  [l-expj7r(n-m)l 

77  m  =— °°,m  71  771 


E  s(m-n) 

7  m  =  0 

=  §~  E  s  (m  -n ) 

7  m  =-00,771  7*0 


r  1— exp  (j7rm)] 

m 

sin2(^L) 

m 


or  s(nr)=T  £  h(nT-A:T)s(/fcT) 

k=-°o 

In  practice,  the  signal  s(k  t  )  has  only  a  finite  extent.  Suppose  that 
s(k  t  )  is  zero  outside  the  index  range  k=0,  1,  •  •  •  ,  N-l. 

N-i 

s  (tit)  =  t  ^  h{nr-kr)s  (kr)  n=0,  1,  •  •  •  ,  N-l 
k  =  o 

V-l 

or  s  (tl t)—t  £  /t(fcT)s(m— A:t)  n=0,  1,  •  •  •  ,  N-l. 

k=-(M- !) 


The  number  of  points  needs  for  applying  Hilbert  operator  is  not  dis¬ 
cussed  in  [tan77a,  79a,  80a].  In  [rab75a],  the  optimum  design  of 
bandpass  Hilbert  transformer  is  derived.  But  the  effect  of  using  N- 
point  finite  signal  is  not  discussed.  If  a  finite  discrete  signal  is  given  as 
(a0>a1,a2,a3),  N=4,  the  Hilbert  transform  can  be  constructed  by  follow¬ 
ing  Table  2.1.  Summation  of  the  slant  elements  equals  to  the  output  at 
,  t— 0,  t-1,  t— 2,  t=3,  •  •  •  etc.  From  Table  2.1,  the  effective  Hilbert 
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operators  at  time  t=0  to  t=3  when  the  signal  is  present  are 
/i_3,h_2.  •  •  •  ,h2,h$.  Seven  points  of  Hilbert  operator  are  used  in  the 
four  point  signal.  If  the  number  of  Hilbert  operator  is  larger  than  7 
points,  the  outputs  at  t=0  to  t=3  are  the  same  as  the  outputs  of  using 
7-point  Hilbert  operator,  although  the  outputs  at  both  ends  outside  the 
interval  (t=0  to  t=3)  are  different.  So  for  a  finite  discrete  signal  with 
length  N,  the  minimum  number  of  the  truncated  Hilbert  operators  is 
2N-1.  There  is  an  example  as  follows. 

Example  1: 

In  Table  2.2,  the  finite  signal  is  given  at  interval  t=ll  to  t=16,  its 
values  are  1,  and  its  length  is  N=6  points.  From  the  above  criterion,  the 
minimum  number  of  points  for  Hilbert  operator  should  be  2N-1  =  11.  If 
a  9-point  (less  than  11)  Hilbert  operator  is  used,  the  outputs  at  t=  1 1  to 
t=16  are  different  from  the  outputs  of  using  an  11-point  Hilbert  opera¬ 
tor.  If  a  13-  or  a  15-  point  (larger  than  11)  Hilbert  operator  is  used,  the 
outputs  at  t=  1 1  to  t=16  will  be  the  same  as  the  outputs  of  using  the  11- 
point  Hilbert  operator,  although  the  tails  outside  the  interval  (t=ll, 
t=16)  are  different. 

It  is  much  faster  to  implement  s(n )  in  the  frequency  domain  using 
the  FFT  algorithm.  In  order  to  avoid  circular  convolution,  signal  s(n) 
should  be  padded  with  sufficient  zeros.  For  non-causal  Hilbert  opera¬ 
tor,  h(ji)  should  be  padded  with  sufficient  zeros  in  the  middle.  Using 
the  base  2  FFT  algorithm,  the  sequences  s(n)  and  h(n)  have  to  be  zero- 
padded  (ZP)  so  that  each  is  (37V— 2)2  elements  long,  where  (37V— 2)2  is 
the  smallest  integer  that  is  a  power  of  2  and  is  greater  than  3N-2. 
Therefore,  the  frequency  domain  implementation  may  be  expressed  as 
s{nr)-r*IFFT  [FFT(s(nr)withZP)  *FFT{h{nr)withZP  in  middle )] 
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Table  2.2  Output  of  Hilbert  operator 


9  point 
Hilbert 
operator 

signal 

output 

1 

.  0000 

.  0 

0000 

2 

.  6366 

0 

0000 

3 

.  0000 

.  0 

0000 

4 

.  2122 

.  0 

.  0000 

5 

.  0000 

.  0 

.  0000 

6 

.  0000 

.  0 

.  0000 

7 

.  0000 

.  0 

0000 

8 

0000 

.  0 

-  2122 

9 

0000 

.  0 

2122 

10 

.  0000 

.  0 

8488 

1 1 

0000 

1  0 

8488 

12 

.  0000 

1.  0 

-  2122 

13 

.  0000 

1  0 

2122 

14 

0000 

1.  0 

2122 

15 

.  0000 

1.  0 

2122 

16 

.  0000 

1.  0 

.  8488 

17 

0000 

0 

.  8488 

18 

0000 

.  0 

.  2122 

19 

.  0000 

0 

2122 

20 

.  0000 

.  0 

0000 

21 

.  0000 

0 

.  0000 

22 

.  0000 

.  0 

.  0000 

23 

0000 

0 

0000 

24 

0000 

0 

0000 

25 

.  0000 

0 

0000 

26 

0000 

.  0 

.  0000 

27 

.  0000 

.  0 

.  0000 

28 

.  0000 

.  0 

0000 

29 

0000 

0 

.  0000 

30 

2122 

.  0 

.  0000 

31 

.  0000 

0 

.  0000 

32 

6366 

0 

0000 

1 

11  point 
Hi  lbert 
operator 
.  0000 

si gnal 

0 

output 

0000 

2 

6366 

0 

.  0000 

3 

.  0000 

.  0 

0000 

4 

.  2122 

0 

.  0000 

5 

.  0000 

0 

0000 

6 

.  1273 

.  0 

1273 

7 

.  0000 

0 

1273 

8 

0000 

.  0 

3395 

9 

.  0000 

.  0 

3395 

10 

.  0000 

.  0 

— .  9762 

1 1 

.  0000 

1.  0 

-  9762 

12 

.  0000 

1.  0 

2122 

13 

.  0000 

1.  0 

2122 

14 

.  0000 

1.  0 

.  2122 

15 

.  0000 

1.  0 

.  2122 

16 

.  0000 

1.  0 

9762 

17 

0000 

.  0 

9762 

18 

.  0000 

.  0 

3395 

19 

.  0000 

.  0 

3395 

20 

.  0000 

0 

1273 

21 

0000 

.  0 

1273 

22 

0000 

.  0 

.  0000 

23 

0000 

.  0 

0000 

24 

.  0000 

.  0 

0000 

25 

0000 

.  0 

0000 

26 

.  0000 

.  0 

.  0000 

27 

.  0000 

.  0 

.  0000 

28 

1273 

.  0 

.  0000 

29 

.  0000 

.  0 

.  0000 

30 

2122 

.  0 

0000 

31 

0000 

.  0 

0000 

32 

6366 

0 

.  0000 

23 


Table  2.2  Continued 


13  point 
Hilbert 


op  era t or 

signal 

output 

i 

.  0000 

0 

0000 

2 

.  6366 

0 

.  0000 

3 

0000 

.  0 

.  0000 

4 

.  2122 

0 

.  0000 

5 

.  0000 

.  0 

0000 

6 

1273 

0 

1273 

7 

.  0000 

0 

1273 

8 

.  0000 

0 

3395 

9 

.  0000 

0 

3395 

10 

0000 

0 

-  9762 

1  1 

0000 

1.  0 

-  9762 

12 

0000 

1  0 

2122 

13 

0000 

1  0 

-  2122 

14 

0000 

1  0 

2122 

15 

.  0000 

1  0 

2122 

16 

0000 

1  0 

9762 

17 

.  0000 

0 

.  9762 

18 

.  0000 

0 

.  3395 

19 

0000 

0 

.  3395 

20 

.  0000 

0 

1273 

21 

0000 

0 

.  1273 

22 

0000 

0 

0000 

23 

0000 

.  0 

0000 

24 

.  0000 

0 

.  0000 

25 

.  0000 

0 

0000 

26 

.  0000 

0 

.  0000 

27 

.  0000 

0 

0000 

28 

1273 

0 

.  0000 

29 

.  0000 

0 

0000 

30 

2122 

0 

0000 

31 

0000 

0 

.  0000 

32 

- .  6366 

0 

0000 

15  point 
Hilbert 
op  erator 

signal 

output 

1 

.  0000 

.  0 

.  0000 

2 

.  6366 

.  0 

.  0000 

3 

.  0000 

0 

.  0000 

4 

.  2122 

.  0 

0909 

5 

.  0000 

.  0 

0909 

6 

.  1273 

0 

-  2183 

7 

0000 

.  0 

2133 

8 

.  0909 

.  0 

4305 

9 

0000 

.  0 

4305 

10 

0000 

0 

-  9762 

1 1 

.  0000 

1.  0 

-  9762 

12 

0000 

1.  0 

-  2122 

13 

.  0000 

1.  0 

2122 

14 

.  0000 

1.  0 

.  2122 

15 

.  0000 

1.  0 

2122 

16 

0000 

1.  0 

.  9762 

17 

.  0000 

.  0 

9762 

18 

.  0000 

0 

.  4305 

19 

.  0000 

.  0 

4305 

20 

0000 

.  0 

2183 

21 

.  0000 

.  0 

.  2183 

22 

.  0000 

.  0 

.  0909 

23 

0000 

.  0 

.  0909 

24 

0000 

.  0 

.  0000 

25 

.  0000 

0 

0000 

26 

-  0909 

0 

.  0000 

27 

0000 

.  0 

0000 

28 

1273 

.  0 

.  0000 

29 

.  0000 

0 

0000 

30 

2122 

.  0 

.  0000 

31 

.  0000 

.  0 

.  0000 

32 

— .  6366 

.  0 

.  0000 

24 


Since  Hilbert  operator  is  not  causal,  when  a  finite  discrete  signal 
convolves  with  Hilbert  operator,  the  output  always  has  small-value  tails 
before  and  after  the  finite  interval  where  the  signal  is  zero  outside  the 
interval.  From  physical  property,  it  is  reasonable  to  assume  that  the 
instantaneous  frequency  is  zero  when  the  signal  is  zero.  So  if  the 
denominator  in  the  discrete  instantaneous  frequency  is  less  than  a 
threshold  (for  example  0.00001),  then  /j(nT)  =  0.  The  other  reason  is  to 
avoid  large  instantaneous  frequency  /^(tit)  for  small  denominator.  In 
the  case  of  negative  instantaneous  frequency  due  to  discrete  calcula¬ 
tion,  they  will  be  set  to  zero  also.  High  peaks  of  instantaneous  fre¬ 
quency  occur  at  both  ends  of  a  finite  interval  signal  because  the  deriva¬ 
tive  will  enhance  the  high  frequency  regions. 


2.6  Instantaneous  Frequency  in  AM  Waveforms 

Several  lemmas  are  presented  in  this  section. 

Lemma  1: 

If  m(t)  is  a  low  frequency  spectrum  and  /  q  is  a  high  carrier  frequency, 
both  spectra  are  non-overlapping,  then  the  instantaneous  frequency  of 
m(t)cos  (2irf  0t),m(t)sin(2-nf  0t),(l+m(t))cos{2nf  0t),  and 

( 1+m  (t))sin  (2nf  0t)  is  /  q,  that  is  the  carrier  frequency. 

Proof : 

Case  1.  s(t)=m  (f)cos27r/0£ 

After  taking  Hilbert  transform, 
s{t)-m  ( t)sin2i\f  Qt 

i/(t)-s(t)+js(t)  =m(t)exj)[j2nf  0t]  /t(0  =  “"M  ~  f 
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The  proofs  of  the  other  cases  are  based  on  the  same  procedure. 

Lemma  2: 

The  instantaneous  frequency  of  AM  delayed  signals  in  Lemma  1, 
m(t)cos(2nf  Q(t-tQ))  etc.,  is  still /0. 

These  two  lemmas  are  used  to  infer  the  instantaneous  frequency  of 
Ricker  wavelets  later. 


2.7  Instantaneous  Frequency  of  Sinusoidal  Waveforms 

with  AM  Applications 

As  in  Lemma  1,  the  sum  of  many  sinusoidal  waveforms  can  be 
derived  for  AM  signals,  then  the  carrier  frequency  of  AM  will  be  the 
instantaneous  frequency.  Several  examples  are  shown  in  Table  2.3.  The 
following  two  lemmas  are  derived. 


Lemma  3: 

If  the  frequencies  of  sinusoidal  waveforms  are  /  \J  2,f  3.  •  •  •  ,/ N  ,  and 
the  sum  of  sinusoidal  waveforms  satisfies  the  properties  of  Lemma  1, 
then  the  instantaneous  frequency  of  this  signal  is  / 


r  _  f  largest  f  smallest 

Ji  g 


Lemma  4: 

From  Lemma  3,  if  the  frequencies  /  lt  f  2,  /3,  •  •  •  fN,  of  the  sinusoidal 
waveforms  form  an  arithmetic  series  and  the  sum  of  these  waveforms 
satisfies  Lemma  1,  then  the  instantaneous  frequency  is 
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Table  2.3  Instantaneous  Frequency  of  the  Sinusoidal  Waveforms 


Signal 

I.F. 

1 

s(t) 

=  COS27T/Of 

/.CL- 

No  baseband 

2 

■( k 

=sin2n/  Qt 

../,0 

3 

s(t) 

=  cos27t/  0*  +  sin2tr/  Qt 

1 

s(t) 

=cos2n/  n  t  +cos2rr/  o(* —£o) 

Delay  case 

2 

s(t) 

=cos2rr/  nt  +  sin27r/  0(*  -*  o) 

3 

s(t) 

=  sin2n/  nt  +sin27r/  o(£-£o) 

/q 

1 

s(t) 

=  acos2rr4/  o t+b  cos2tt6/ 

+  6cos2tt8/o^  +acos2rrl0/o^ 

=  2(acos27T3/  0t  +6  cos27t/  00 

7/o 

•cos2tt7  f  0t 

Sum  of 
sinusoidal 
signal 

2 

S(t) 

-acos2n2f  0t  —b  sin2rr3/ 

+  6  sin27T9/  0t  +  a  cos2tt  1 0/  0t 
=  2(acos2tr4 /  0t  +  6  sin2n3/  0O 
•cos2n6  /  o  t 

6/o 

3 

s(t) 

=  sin27r2 /  0t  +  sin2rr6/  0t 
=  2sin2n4/  Qt  cos27t2/  Qt 

4/o 
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,  _  /  l+f  2+f  3+  •  *  •  +/n 

fi  N 

_  largest  ^  f  smallest 

2 

2.8  Rayleigh  and  Uniform  Distributions 

Rayleigh  and  uniform  distributions  have  been  derived  in  [pap65a]. 
If  two  random  variables  X  and  Y  are  Gaussian,  independent,  with  zero 
mean  and  equal  variance,  then  the  function 

Z=Vj2+y2 

has  a  Rayleigh  distribution  as  follows. 

PZ (2 )' =  -\*xp  [ ~  , 2  >0 

o2  2  a2 

=  0  ,  z<0 

Let  W =tan-1  ~ 

then  random  variable  W  is  uniform  in  the  interval  ( :=^-  — ) 

2  2 

These  results  are  used  in  the  following  analysis. 


2.9  Sinusoidal  Signal  and  Gaussian  Noise  in  Analytic  Signal  Analysis 

The  properties  of  the  noise  in  the  analytic  signal  analysis  are  inves¬ 
tigated  here.  In  order  to  collect  seismic  reflection  signal,  there  is  a 
bandpass  filter  at  the  seismic  receiving  instrument.  The  ground  motion 
noise  also  passes  through  this  bandpass  filter.  The  simulated  Gaussian 
bandpass  white  noise  passing  through  10  -  60  Hz  filter  can  be  written  as 
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[zie75a,  gag78a] 

n(^)=7T-c(0C0S(^c^+'^)~  n,s(£)sm(cjc£+'^)  where  fc=35Hz. 

From  Ziemer  and  Tranter  [zie76a],  nc(f)  is  generated  by 
n(t)2cos(uct+T{/)  passing  through  an  ideal  low-pass  filter  0-25  Hz  and 
ns(£)  is  generated  by  n(£)[— 2sm(oc£  +^)]  passing  through  the  same 
ideal  low-pass  filter.  If  n(f)  is  a  stationary  Gaussian  process,  then  nc  (t) 
and  ns(t)  must  be  Gaussian  processes,  with  mean  zero  and  variance  c r2. 
If  the  power  spectral  density  function  of  the  Gaussian  process  n(£), 
Sn(f  )>  is  10  -  60  Hz  white,  i.e.  symmetric  with  respect  to  fc ,  then  the 
cross-correlation  7?sc(t)=  Rcs(t)=0,  i.e.,  nc(t)  and  ns(t)  are  indepen¬ 
dent  Gaussian  processes  [zie76a,  gag78a].  Gaussian  white  noise  n(£)  is 
filtered  by  the  10-60  Hz  bandpass  filter.  The  carrier  frequency  of  Ti(t) 
is  f c-33Hz .  nc(t )  and  r„s(t)  are  0—25 Hz  low-pass  so  fc-33Hz  lies 
above  the  spectra  of  nc(t )  and  ns(£).  Applying  Hilbert  transform 
theorem  in  Section  2.2,  the  Hilbert  transform  of  n(f ).  n(t),  is 

n(t)=nc(t)sin(uct+f)+  ns(t)cos  {uct +Tp) 

Although  the  same  form  of  n(t)  and  n(t )  is  given  in  [pap65a],  the  gen¬ 
erations  of  nc  ( t )  and  ris(t)  are  different.  In  the  following  derivation,  set 

Tf/=0  for  convenient  calculation.  For  signal,  the  sinusoidal  signal  is 
s  (t)=Scos  (cjjf ). 

Case  1:  Gaussian  10  -  60  Hz  white  noise  process  only. 

Bracewell  [bra78a]  stated  that  "  Regarding  y(t)  as  the  superposi¬ 
tion  of  harmonic  components  in  unrelated  phases,  we  see  that  the  pro¬ 
cess  of  Hilbert  transformation  which  shifts  every  Fourier  component  by 
a  different  amount,  gives  another  but  independent  superposition  of  the 
same  kind.  Based  on  these  assumptions  it  is  thus  found  that  the 
envelope  is  distributed  according  to  a  Rayleigh  distribution,”  where 
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y(t)  71  (0  here.  But  this  statement  lacks  a  mathematical  proof. 
Applying  Hilbert  transform  theorem  to  n(t),  we  can  prove  it  here. 
Substituting  n(t)  and  n(t)  into  the  formula  of  the  analytic  signal 
representations, 

Env(s(t)j  =A(0=  (0  I  =^ri2(0+rj:2(0  = 

where  nc(t)  and  ns(t )  are  N(0,a2  )  and  independent,  so  A(t)  is  a  Ray¬ 
leigh  distribution. 

pA{t)(CL)=-\exP  [-— -jd.a&O 
(T  2  oz 

Lemma  5:  The  envelope  of  Gaussian  bandpass  white  noise  is  a  Rayleigh 
distribution. 


For  instantaneous  frequency, 


f  I.F.  (0  -  1 


n(f) 


dn  (t) 

dt 


•n(0 


2n 


dn  (Q 

dt 


nz(t)+nz(t) 


1  r  ,  nc(t)ns(t)-ns(t)Tic(t)  ^ 
'Lwc  + - 5777" - 577T - J 


2rr  l  c  nz(t)+nz(t) 

So  the  instantaneous  frequency  is  centered  at  the  carrier  frequency  fc. 
The  denominator,  nc  (f  ),  nc  (t ),  ns(t),  and  ns(t)  are  Gaussian. 
nc  (t)+ns2(t)  is  X2(2),  Chi-square  of  2  degree  of  freedom,  but 
nc  )h,s  (0  ~ ns  (Qh'c  (0 

nc2(^)+7^s2(^) 

distribution. 


-is  not  expected  to  have  any  specific  probability 


Case  2.  Sinusoidal  signal  plus  Gaussian  10  -  60  Hz  white  noise  with  the 
same  carrier  frequency  fc  . 
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x(0-s(0+71(0-*5'  coscjc£  +nc  (£)cos(cjef  +ip)—  ns(t)sin(uct 
Envfx(t)j  =  A(t)  =  |TKt)|  =  Vx2(0+x2(0  =  V(5+nc  (0)2+n*z(0 
The  distribution  of  A(t)  is  a  Rician  density  [gag78a]. 

PMt)(a)=^xP[~^^L]I o(^f-),  a>0 

where  /q(  — )  is  the  modified  Bessel  function  of  order  zero, 
or 

For  strong  signal,  Sz»a2 
then,  A(t)  =  S+nc  (f ) 

Hence  A(t)  is  approximately  normal,  with  mean  S  and  variance  <j2: 

For  weak  signal,  S2«a2 

A  ( t) = Vn?(0+n^(0 


which  is  the  same  as  the  Rayleigh  distribution  in  Case  1. 
For  instantaneous  frequency, 


_ at _ 

x2(t)  +  x2(t) 


 1 


2n 


[wc  + 


(5+nc(Q)ns(Q 
(5  +n,c(^))2+ns2(0 


1 


djcjty 

dt 


So  the  instantaneous  frequency  is  always  centered  at  the  carrier  fre¬ 
quency  fc.  The  experimental  results  for  weak  signal  are  shown  in  Fig. 
2.2. 


Case  3.  Sinusoidal  signal  plus  Gaussian  10  -  60  Hz  white  noise  with  car¬ 
rier  frequencies  /  j  and  fc  respectively. 

x(0=s(0+7l(0=<S' coscjj£  +nc  ( t  )cos(cjc  t  +Tp)— ns  ( t)sin  (cjc  t  +^) 

Env[x(t)|  =  A(t)  =  |f(t)| 

=  V(ST  +  m  j(f  +^*(0 


Vi2(f)+i2(i) 
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Figure  2.2A  Weak  signal  (/  t=35 Hz)  plus  Gaussian  noise,  its  envelope  and  instan¬ 
taneous  frequency 


Frequency  of  Occurrence 
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Figure  2.2B  Envelope  distribution  of  Figure  2.2A 


Envelope 

(Kotts) 


(xlOO%) 


Figure  2.2C  Instantaneous  frequency  distribution  of  Figure  2.2A  (Instan¬ 
taneous  frequency  centered  at/c=35tfz  ) 
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where  mi(0  7ic  (f )cos (cjc  cjj)£  ns (t )sin (cjc  —  &;■[)£  and  Gaussian 

N(0,v2). 

For  strong  signal,  S2»cr2 
then,  A(t)  =5’+m1(£) 


Hence  A(t)  is  approximately  normal,  with  mean  S  and  variance  o2: 

-<n~:  <?^2 


For  weak  signal,  S2«v2 


A(0=Vnc2(0+<(0 

So  A(t)  is  a  Rayleigh  distribution. 


For  instantaneous  frequency, 


f  I.F.  (0  - 


dt 


2n 


dt 


V)+x2(t) 


=  1  Mz(t) 
2n  Mi(t) 


where 

M1(t)=S2+n2(t)+n2(t)+2Sm1(t) 

and  if2(0  =  52w1-5 [-nc  (t)sin (cjc  -c^)*  -ris  (£)cos  (Ui-cjc )t  ] 

+  Sucm1(t)+So1m1(t)+ri2(t)uc+nc(t)ns(t)-nc(t)Tis(t)+n2(t)uc 


For  strong  signal,  S2»a2 

f  i p  (t)  =  [cji+  — —  ^  1 

ji.f.kj  2tt  11  M^t)1 
where 


Ms(t)=-cj1n2(t)-u1n2(t)-Su1m1(t) 

“S[-nc  (t  )sin  (wc  -cjj)£  -ns  ( t  )cos  (wj-cj c  )£ ] 

+  Swcm,(*)+nc2(0cjc  +nc(0ns(0-nc(f)ns(0+ns2(0wc 
The  instantaneous  frequency  is  centered  at  the  carrier  frequency  of  the 
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signal,  /j. 

For  weak  signal,  Sz«az 

f  (+\  —  ^  IV,  4. 

fl.F.W)  -  ir—lu c  + 


2  7T 


3 


where 


)  =  ~wc  S'2— S  ocm1(t)+Sza1 
-S[-ric(t)sin(oc-u1)t-ns(t)cos(u1-oc)t] 

+Scj1m1(0+nc  (0ns(0~nc(0ns(0 

So  the  instantaneous  frequency  is  centered  at  the  carrier  frequency  of 
the  Gaussian  noise,  fc .  The  experimental  results  of  high  S/N  are  shown 
in  Fig.  2.3.  The  experimental  results  of  low  S/N  are  shown  in  Fig.  2.4. 

From  Case  3,  for  high  S/N,  instantaneous  frequency  can  be  used  to 
detect  a  hidden  periodic  signal  when  its  period  is  unknown.  This  is  a 
different  way  to  use  periodogram  [blo76a]  and  harmogram  [hin82a]. 


2.10  Rayleigh  Distribution  of  the  Envelope  of  Gaussian  Noise 

in  Discrete  Time  Case 

I.  Discrete  Envelope 

s  (n)  is  the  convolution  of  s(n)  with  Hilbert  operator  h(n). 

The  discrete  envelope  is  defined  as 

^(n)rVs2(n)+s2(n)  =->V  sz(n)7 £*  h(n-k)s(k) 

V  k  =0 

=  "\/  s2(u. )+  2  h(k)s(n—k) 

v  k=-(N- 1) 


n  =  0,l,2,  •  •  •  ,N- 1. 
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(Volts) 


Figure  2.3A  Strong  signal  (f  i=lOHz)  plus  Gaussian  noise,  its  envelope  and 
instantaneous  frequency 


Frequency  of  Occurrence 
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(x!00%) 


Figure  2.3B  Envelope  distribution  of  Figure  2.3A  (  S/N  is  high  )  (^°^s) 


Frequency  of  Occurrence 
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(x!00%) 


Figure  2.3C  Instantaneous  frequency  distribution  of  Figure  2.3A  (  S/N  is 
high ) 
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Figure  2.4A  Weak  signal  {f  x=  10 Hz)  plus  Gaussian  noise,  its  envelope  and 
instantaneous  frequency 


Frequency  of  Occurrence 
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Figure  2.4B 


(x!00%) 
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II.  Rayleigh  Distribution  of  Discrete  Envelope 

Lemma  6: 

Suppose  that  s(n),  n=0,  1,  2,  •  •  •  ,  N-l,  where  N  is  finite,  are  samples 
from  a  Gaussian  independent  noise,  with  zero  mean  and  variance  a2. 
After  Hilbert  transforming,  s(n)  is  Gaussian,  independent  from  s(n)  at 
the  time  n,  and  the  envelope  distribution  in  the  analytic  signal 
representation  is  a  Rayleigh  density. 

Proo f  : 

s(n),  n=0,  1,  •  •  •  ,  N-l,  are  Gaussian  Af(0,o2)  and  independent. 

?(n)  =  S'  A(A:)S(7i-A:)=*2  h(n-fc)s(A:) 
fc=— ( N-l)  k  =o 

s(n)  is  a  linear  combination  of  Gaussian  s(k),  k=0,  1,  •  •  •  ,  N-l,  and  is 
also  a  Gaussian,  with 

Mean  of  s(n)=E[s(n)]  =  ^ h(n-k)E[s  (k)]=0,  and 

k  =0 

Variance  of  s(n)  =  *£*  h2(fc)Far[s(n-fc)]=721/i2(n-fc)Var[s(A:)] 

k  =  -(N- 1)  fc=0 


After  n  5,  Va r[s(n)]-cr2.  So  s  (n)  is  Gaussian  with  N(0, a2) 

The  next  step  is  to  prove  that  s(n)  and  s'(n)  are  independent  from  each 
other  at  the  same  time  t=n. 

Covarianc e(s(n)  ,s(n ))  =  COV(s  (n  )/][>  (n  -k  )s  (/c )) 

fc  =o 

=  COV(s  (n),[h  (n  )s  (0)  +h  (n  —  l)s  ( l)+/i  (n  — 2)s  (2)+  •  •  • 

+/i(0)s(n)+  •  •  •  +h.  (n— (iV  — l))s(A^  —  1)]) 

=  A’[(s(r?,)-0)[h(n)s(0)+  •  •  •  +h(0)s(n)+  •  •  •  +h(n  -(JV-l))s (TV  — 1)— 0]] 
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-0+0+  •  •  •  +£’[s(n)/i(0)s(n)]+0+0+  •  •  •  +0 

=/i(0)£’[s2(n)]  =  0<72=0 

(For  n  jtk  ,  s(n)  and  s(k)  are  independent.) 

So  s(n)  and  s(n)  are  uncorrelated  at  the  same  time  t=n.  Because  s(n) 
and  s(n )  are  Gaussian  N(0,  a2  ).  So  they  are  independent  at  the  time 
t=n. 

Hence,  envelope  A(n)=  Vs2(n)+s2(n)  is  a  Rayieigh  distribution. 


2.11  Instantaneous  Phase  of  Gaussian  Noise 


Lemma.  7: 

The  instantaneous  phase  of  Gaussian  white 
distribution  between 


_  -and  „ 
2  2 


bandpass  noise  is  a  uniform 


Proof : 

(1)  Continuous  time  case 
For  instantaneous  phase, 

iKO^tan-1^- 

n(t) 

n  (*  )=nc  (t) cos(cjc  t  +Tp)—ns  (f  )sin  (cjc  t  +y) 
n  {t  )=nc  (£  )sin(cjc  t  +f)+ns  (t)cos  (cjc  t  +f ) 
E[n(t  )n(t )] 


=£’[(nc  (Ocos(wc  t  +^)-ns (f)sin  (wc  t  +T//))(nc  (t )sin(cjc  t  +f)+ns  (t)cos  (cjc  t  +^))] 
=  RCS  (0)cos2(wc  t  +y)-Rsc  (0)sin2(cjc  t  +^) 

=0  (for  Gaussian  bandpass  white  noise,  Rsc  (t)=Rcs  (t)  =  0  ) 

Because  they  are  Gaussian,  n(t)  and  n(t)  are  independent  at  the  same 
time  t. 
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Thus,  i9(£)=tan  1  \  is  a  uniform  distribution  between  — ^and  — . 

n\t)  2  2 

(2)  Discrete  time  case 

From  Lemma  6,  n(t)  and  n(i)  are  Gaussian,  independent  with  N(0,  a2), 
so  the  instantaneous  phase  is  uniformly  distributed  in  the  interval 

k  2  ’  2  '' 

The  distributions  of  envelope,  instantaneous  phase,  and  instantane¬ 
ous  frequency  of  Gaussian  noise  are  shown  in  Fig.  2.5. 


2.12  FM  Demodulation  Using  Instantaneous  Frequency 


FM  baseband  signal  can  be  demodulated  by  using  instantaneous 
frequency. 

(I)  FM  demodulation  without  noise. 

Lemma  8: 

A  frequency  modulated  waveform  has  the  form 
s  (£)=Acos  [Wc  t  +A  uf  m  (t)dt  ] 
or  s  (£)= Asm  [cjc  £+Aw/77i  (£)d£] 

If  the  carrier  frequency  fc  is  high  and  the  fequency  spectrum  of  m(t ) 
is  low,  i.e.,  J2k(P)=Q&ndJzk+i(l3)=0ivh'en(2k  +  ].)cjm>cjc  for  large  posi¬ 
tive  integer  h,  then  the  instantaneous  frequency  of  this  FM  signal  can 
be  derived  as 

o(t)=oc  +  -^j{Aufm(t)dt] 

=  wc  +Aum  (£) 

°r  /i(«)=/c  +  f^m(0 

and  c) 
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Figure  2.5A 
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Gaussian  noise,  W(0,0.024z)  sampling  interval  =  0.004  seconds 
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Figure  2.5B  Envelope  distribution  of  Figure  2.5A 


Frequency  of  Occurrence 
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(x!00%) 


Figure  2.5C  Instantaneous  phase  distribution  of  Figure  2.5A  by  shifting  — 

2 


Frequency  of  Occurrence  (x!00%) 
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( 


Figure  2.5D  Instantaneous  frequency  distribution  of  Figure  2.5A 
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Proof : 

Let  m(t)=cos  omt ,  A=1 
s  (f )  =Acos  [ac  t  +A uf  cosum  tdt  ] 

=  cos(cjc£  +(8sincjm0  where  (3- 

=  cosuctcos  (0sincjm  t )  -sin cjc  tsin  (/3sinwm  t ) 

DO 

=coscoct [Jo(p)+2  2  J zk (P)cos (2fc um t )] 
fc=l 

DO 

-sin(jei[2  ^  ^2fc  +  i(/S)sin[(2fc  +  l)cjm0] 

fc=0 

where  J2fc(0)and/2fc+i(/3)  are  Bessel  functions. 

From  the  assumption,  JZk(P)=0&n&JZk+1{p)=Qwhen{2k  +  \)um>uc  for 
large  positive  integer  k. 

Then  using  Hilbert  transform  theorem 
H.T.|a(t)|=ff(0 

=smwBf[;0(p)  +  2  2  «/2fc(/3)cos(2fcwmO] 

k  =  1 


+  cosuc*[2  2  ^2fc+i(/S)srn[(2fc  +  l)wm«]] 
fc  =o 

=sm  cjc  £cos  (0sincjm  £  )+coscjc  tsin  (/Ssincjm  £ ) 
=sm(cjc  £  +/3sinum£) 

f(0=s  (t)+js  ( t )  -exp  [j  (cjc  t  +/Ssinum  £  )] 

M  =  ^Im| 

=  /c  +  ^-oosom«  =/c  +  ^m(t) 
then,  nx(0=T^-(/i(^)-/c) 


The  same  procedures  can  be  used  for  s(£)=Asm[cjc£+Au/coscjm£ci£] 

In  the  computer  simulation,  there  are  two  examples  as  follows. 
Example  2: 
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s  (f )  =sin  [2nfc  t  +2tt  fdfm  ( t)dt ] 

where  / c=25Hz  J d=25Hz ,  the  sampling  interval  is  0.001  second  and 
m(t)  is  a  unit  step  function.  The  instantaneous  frequency  of  this  FM  is  a 
step  function  shown  in  Fig.  2.6A. 

Example  3: 

s  (0=sm[27r/c£  +kuf  cosomtdt ]  ,  where  m(t)=cos  amt 
=sin  [2nfc  t  +  -^-sinum  1 1 

=-sin[2T\fct  +  sin2nfmt],  let-^-=l 

“m 

where  fc  =  50 Hz  ,fm  =  5 Hz . 

In  order  to  satisfy  the  conditions  in  Lemma  8,  i.e.,  the  carrier  frequency 
fc  is  high  and  the  frequency  spectrum  of  m(t)  is  low, 
J zk (0)=O  and  Jzk-¥\{P)=^  when (2k  +  l)wm>cjc  for  large  positive  integer  k  , 

fi(t)=fc  +  ~^osumt 

=  fc+fmCOSCJmt 

The  instantaneous  frequency  of  this  FM  is  a  cosine  waveform  shown  in 
Fig.  2.6B.  The  small  variation  of  the  results  shown  in  Fig.  2.6A  and  2.6B 
comes  from  the  finite  discrete  calculation  like  the  FIR  filter  [opp75a, 
rab75a]. 

(II)  FM  demodulation  in  Gaussian  noise. 

In  the  Gaussian  10  -  60  Hz  white  noise  plus  FM  signal, 

Let  m  (f  )  =  cos  cjm  t 
x(t)=s(t)+n(t) 

=Acos  (uc  t  +/3sinwm  t )+nc  (f  )cos(cjc  t  +ip)—ns  ( t  )sin  (cjc  t  +ip) 

After  taking  Hilbert  transform,  we  obtain 

x  ( t  )=Asin  (uc  t  +/3sinwm  t  )+nc  (Osin(cjc  t  +i/)+ns(t)cos  (uc  t  ) 

For  convenience,  set  ^=0. 
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Figure  2.6A  Instantaneous  frequency  of  FM  without  noise,  baseband  sig¬ 
nal  is  a  step  function 
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Figure  2.6B  Instantaneous  frequency  of  FM  without  noise,  baseband  sig¬ 
nal  is  a  cosine  waveform 
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in  which  the  numerator  in  the  instantaneous  frequency  formula 
at  cLt 

=Az(u+pum  cosum  t)-Anz(t )  +A coc n  )  +A  (cjc  +/?o;m  cosom  t)n t(t ) 

+oc  (ricHt)+ns2(t))+nc(t)ns(t)-ns(t)nc( t) 

where  n1(t)=nc  (t)cos  (/?sincj  mt  )+ns(t)sin((3sincjmt ) 
where  nz(t )  =ric  (£  )s in  (/Ssincjm  t )  +'ns  (£  )cos  (0sincjm  t ) 
and  the  denominator  in  the  instantaneous  frequency  formula 

xZ{t)+xz{t)=Az+n?(t)+n2{t)+2AnAt)  =Nl(t) 

^  A>>  o  ,  i.e.  for  strong  signal,  the  instantaneous  frequency  can  be 
derived  as 


f  c  +Pf  mcosamt  +  ir~Nz(t) 

CT\ 


then,  m  (t)=cos umt  = 


7— 


=  Ui  (<)-A 


-^2(0  -I  2rr 
2tt  J 


27T 


where  jV2(f  )= 


^s(Q 

N  ,(f) 


and  7V3(i)-  Anz(t )+Aucnx(t )+A(oc  +flcjmcosomt )ni(£) 

+  qc  (nc2(0+ns2(0)+nc  (On,(f)-n, (*K  (0 

-(wc+/5wmcoscjm^)(i42+nc2(0+n52(^)+2>in1(0) 

^  A<<  a2,  i.e.  for  weak  signal,  the  instantaneous  frequency  can  be 
derived  as 


A(0=/e  + 


^4(0 

27T 


where  7V4(f  )=  — 

NAt) 

and  N5(t)=A2(uc  +(3umcoscjmt)-Anz(t)+Acjcn1(t ) 
+44(wc  +/?cjrrtcoswmOni(0 

+nB(f)ns(0-n*(fKJ(f)--«c  U2+2^n!(0) 
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The  instantaneous  frequency  will  be  centered  at  the  carrier  frequency 
f  c  •  The  signal  will  not  be  demodulated. 

If  the  signal  is  a  sine  carrier,  i.e.  s(f)=i4sin[<yc (t )+jffsinamf],  the 
results  will  be  the  same. 

In  the  computer  simulation,  there  are  two  examples  as  follows. 
Example  4: 

x  (0=s  )+ti  (t )  =Asin  [27 r/ c  t  +  27t/ d fm(t  )dt  ]+n  (t ) 
where  m(t)  is  a  unit  step  function,  f  c=35Hz  ,f  d=25Hz. 

For  strong  signal,  A»  a2,U  =  l,a2=0.052)  ,  the  result  is  shown  in  Fig.  2.7, 
a  step  function. 

Example  5: 

x(t)=s(t)+n(t)=Asin[2nfct+bafm(t)dt]+n(t), 

where  m(t)  =cos  umt,fc  =35  Hz  ,fm -5  Hz. 

For  strong  signal,  A>>  oz,{A  =  l,o2=0.052),  the  result  is  a  cosine 
waveform  and  shown  in  Fig.  2.8. 


2. 13  Envelope  Application  to  Classification  of  Signal 
and  Gaussian  Noise 

In  Fig.  2.9,  there  are  two  hypotheses: 

H0-  x(t)=n(t) 

Hx\  x(t)=s(t)+n(t) 

where  signal  s(t)=S  cose ht  and  Gaussian  noise 

n(^)=^c  (Ocos(wc^  +Tp) -ns(t)svn(uct+f) 

Consider  Hq.  x(t)=n(t)  then  its  envelope  A(t)  as  the  above  property  is  a 
Rayleigh  distribution. 
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Figure  2.8  Instantaneous  frequency  of  FM  with  high  S/N,  baseband  signal 
is  a  cosine  waveform 
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Figure  2.9A  Signal  +  Noise 
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Figure  2.9B  Envelope  of  Figure  2.9A 
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Figure  2.9D  Classification  result  using  envelope=0. 122 
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Mt)=Vn*{t)+n*(t) 

PA{t)(a)~-^xp  [  -  ja2^  ],a>0 

o  2  oc 

Consider  H1:x(t)=s  (t)+n(t),  then  for  strong  signal,  i.e.  high  S/N,  its 
envelope  A(t)  is  approximately  Gaussian. 

PA^a^-^kpxp[:±7J?] 

Using  Bayes  decision  rule, 

P(H0)p(X/  H0)  >  P{H^)p  {X/  //j)  ,  thenX  belongs  to  H0. 

)p  ( X /  Hq)  <  P(H  j)p  (X/  H^)  ,  then  X  belongs  to  Hv 

F or  half  region  is  signal  and  half  region  is  noise,  so  selecting 

P{H o )=P(H j)=  1/  2,  the  Bayes  classifier  is 

■]  a  V2tt  iS"  / r,  £7*  \ 

loge  — - - =  -^-A2a-S) 

o  20* 

For  noise,  cr=0.05,  signal  S=0.2,  then  the  Bayes  classifier  is  a  =  0.122. 
The  classification  result  is  shown  in  Fig.  2.9D.  Compared  with  the  origi¬ 
nal  signal  plus  noise  in  Fig.  2.9A,  the  result  in  Fig.  2.9D  is  quite  good. 

Envelope  in  the  signal  s (£)=£coscjj£  plus  noise  and  noise  have 
Gaussian  and  Rayleigh  distributions  respectively.  So  the  Bayes 
classifier  can  be  determined  easily.  But  instantaneous  frequency  distri¬ 
butions  in  the  signal  plus  noise  and  noise  are  centered  at  /  j  and  f  c 
respectively.  No  model  distribution  can  be  used  and  the  classifier  may 
be  designed  from  histogram  distribution.  So  using  envelope  as  the 
feature  for  the  classification  of  signal  and  noise  is  better  than  using  the 
instantaneous  frequency  for  high  S/N. 


2. 14  Resolution  and  Analytic  Signal  Analysis  in  Ricker  Wavelets 
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The  central  part  of  a  /  —Hz  zero-phase  Ricker  wavelet  [ric40a,  45a, 
53a]  can  be  considered  as  a  double-sideband  modulated  waveform  with 
carrier  frequency  / .  The  carrier  frequency  /  is  higher  than  the 
baseband  signal  and  satisfies  the  Hilbert  transform  theorem.  So  the 
instantaneous  frequency  will  be  approximately  /  Hz.  The  instantane¬ 
ous  frequency  of  25  Hz  Ricker  wavelets  is  shown  in  Fig.  2.10  for  sam¬ 
pling  interval  0.001  second  and  in  Fig.  2.11  for  sampling  interval  0.004 
seconds.  Although  the  maximum  of  the  instantaneous  frequency  of  25 
Hz  Ricker  wavelets  are  different  for  different  sampling  intervals,  they 
are  approximately  equal  to  f  Hz  of  f  —Hz  Ricker  wavelets. 

For  the  resolution  problem,  from  Robertson  and  Nogami  [robBla], 
the  limit  of  resolution  between  two  seismic  wavelets  is  the  sand  thick¬ 
ness  thinning  to  a  quarter  period  of  the  dominant  seismic  wavelet,  i.e., 
the  distance  between  the  peaks  of  the  two  seismic  wavelets  should  be 
larger  than  half  period  of  the  dominant  wavelet.  The  situation  needs  to 
be  investigated  here  is  when  the  peak-to-peak  distance  of  two  seismic 
wavelets  is  less  than  the  half  period  of  the  dominant  wavelet. 

Example  6: 

For  20  Hz  Ricker  wavelet,  /  =20,  T=  -^-=0.05sec. 

20 

then  -^-=0.025sec.  (half  period) 

Sampling  interval  =  0.004  sec. /per  interval 
0  004  ~6-^5~6or7  points  (for  resolution  limit) 

In  Fig.  2.12,  results  of  the  4-point  peak-to-peak  distance  of  two  Ricker 
wavelets  are  shown,  where  the  dominant  20  Hz  zero-phase  Ricker 
wavelet  is  related  to  the  reflection  of  high  amplitude,  low  frequency, 
and  phase  reversal;  the  other  is  30  Hz  Ricker  wavelet.  Some  experi¬ 
ments  are  conducted  when  the  distances  between  two  peaks  of  zero- 


( Volts  ) 


Figure  2.10  Instantaneous  frequency  of  25  Hz  Ricker  wavelet  (  Sampling 
interval  0.001  second  ) 
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( Volts ) 


Figure  2.11  Instantaneous  frequency  of  2 h?  .  . 

interval  0.004  seconds )  y  Hz  Ricker  wavelet  (  Sampling 
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^neureD2\12A  Instantaneous  frequency  and  envelope  of  mixed  20  Hz  and 
30  Hz  Ricker  wavelets  (Phase  reversal  and  4  point  peak-to-peak  distance) 
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phase  20  and  30  Hz  Ricker  wavelets  are  26,  15,  8,  7,  6,  5,  4,  3,  2,  and  1 
points.  The  separability  of  two  Ricker  wavelets  are  quite  good  until  the 
peak-to-peak  distance  reduces  to  6  or  7  points  (half  period).  When  the 
peak-to-peak  distance  is  less  than  6  points,  for  the  dominant  20  Hz 
Ricker  wavelet,  its  instantaneous  frequency  is  not  affected  too  much, 
but  the  peak  of  its  envelope  is  shifted  and  higher  than  that  of  the  origi¬ 
nal  20  Hz  Ricker  wavelet  because  of  the  interference  from  low  ampli- 
tude  17 Hz  Ricker  wavelet. 


2.15  Conclusions 

The  conclusions  of  this  investigation  are  summarized  as  follows. 

(1)  Hilbert  transform  theorem  plays  the  most  important  role  in  the 
analytic  signal  analysis. 

(2)  The  minimum  number  of  points  of  Hilbert  operator  required  for 
a  finite  discrete  signal  with  length  N  is  2N-1.  In  order  to  implement 
s(tl)  in  the  frequency  domain  using  FFT  algorithm  and  avoid  circular 
convolution,  the  minimum  number  of  points  of  signal  s(n)  and  Hilbert 
operator  h(n)  is  (3N—2)2,  where  (3N—  2)2  is  the  smallest  integer  that  is 
a  power  of  2  and  is  greater  than  3N-2.  For  non-causal  Hilbert  operator, 
h(n)  should  be  padded  with  sufficient  zeros  in  the  middle. 

(3)  Instantaneous  frequencies  of  AM  and  sinusoidal  waveform  are 
equal  to  their  carrier  frequencies. 

(4)  The  results  of  signal  and  Gaussian  bandpass  white  noise  in  the 
analytic  signal  analysis  are  as  follows. 

Case  1:  Gaussian  10  -  60  Hz  white  noise  only. 

The  envelope  is  a  Rayleigh  distribution,  the  instantaneous  phase  is  a 
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uniform  distribution  between  -rr/2  and  tt/2.  Both  properties  can  be 
proved  in  the  continuous  and  discrete  time  domains.  The  instantane¬ 
ous  frequency  is  centered  at  the  carrier  frequency  /c. 

Case  2:  Sinusoidal  signal  plus  Gaussian  10  -  60  Hz  white  noise  with  the 
same  carrier  frequency  fc. 

The  envelope  is  a  Rician  density  [gag78a].  For  strong  signal,  the 
envelope  is  a  Gaussian  distribution.  For  weak  signal,  the  envelope  is  a 
Rayleigh  distribution.  For  both  strong  and  weak  signal  cases,  the 
instantaneous  frequency  is  centered  at  the  carrier  frequency  fc 
Case  3:  Sinusoidal  signal  plus  Gaussian  10  -  60  Hz  white  noise  with  car¬ 
rier  frequencies  /  j  and  fc  respectively. 

For  strong  signal,  the  envelope  is  also  a  Gaussian  distribution  and  the 
instantaneous  frequency  is  centered  at  the  signal  carrier  frequency  f  j. 
So  using  instantaneous  frequency  analysis  can  detect  a  hidden  periodic 
signal  for  high  S/N  when  its  period  is  unknown.  For  weak  signal,  the 
envelope  is  also  a  Rayleigh  distribution  and  the  instantaneous  fre¬ 
quency  is  centered  at  the  noise  carrier  frequency  fc . 

(5)  FM  demodulation  using  instantaneous  frequency  is  quite  satis¬ 
factory  for  high  S/N. 

(6)  For  high  S/N,  using  envelope  as  the  feature  for  classification  of 
signal  and  noise  is  better  than  using  the  instantaneous  frequency. 

(7)  The  central  part  of  a  /  —Hz  zero-phase  Ricker  wavelet  can  be 
considered  as  an  approximately  sinusoidal  signal  with  carrier  frequency 
/  .  So  the  instantaneous  frequency  will  be  approximately  /  Hz.  If  the 
peak-to-peak  distance  of  two  Ricker  wavelets  is  less  than  the  half 
period  of  the  dominant  one,  then  its  instantaneous  frequency  is  not 
affected  too  much,  but  its  envelope  is  shifted  and  higher  than  the  origi¬ 
nal  one. 
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CHAPTER  III 

DECISION-THEORETIC  PATTERN  RECOGNITION  FOR  DETECTION 
OF  CANDIDATE  BRIGHT  SPOT 


3.1  Introduction 

Feature  extraction  in  seismic  signal  has  been  discussed  in  Chapter 
2.  In  this  Chapter,  decision-theoretic  pattern  recognition  -will  be  used 
to  detect  candidate  bright  spot. 

As  mentioned  in  Chapter  2,  zero-phase  Ricker  wavelets  are  usually 
used  in  the  simulation  of  seismic  analysis  [robSla,  tan79a].  The  pattern 
wavelet  of  the  bright  spot  in  real  data  can  be  compared  with  the  central 
part  of  the  zero-phase  Ricker  wavelet.  From  Chapter  1,  the  physical 
properties  of  bright  spot  is  known  to  be  relative.  Here,  20Hz  zero- 
phase  Ricker  wavelet  is  simulated  as  the  reflection  wavelet  of  bright 
spot.  The  20 Hz  Ricker  wavelet  has  the  physical  properties  of  high 
amplitude,  low  frequency  content,  and  phase  reversal.  30 Hz  zero- 

phase  Ricker  wavelet  is  simulated  as  the  reflection  wavelet  of  non- 
bright  spot. 

At  first,  Ricker  wavelets  are  classified  by  using  linear  and  tree 
classification  techniques.  Then  a  simulated  seismogram  using  Ricker 
wavelets  is  classified.  The  usage  of  Ricker  wavelets  in  the  simulation  is 
to  test  the  proposed  techniques  as  they  are  applied  to  the  real  seismo¬ 
gram.  At  last,  pattern  recognition  is  used  for  the  detection  of 
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candidate  bright  spot  in  the  real  seismogram. 

The  classification  result  in  this  chapter  also  gives  the  candidate 
bright  spots.  The  use  of  spatial  relation  to  test  the  consistency  of 
reflection  layer  of  bright  spot  is  discussed  in  Chapter  6. 

In  this  chapter,  as  in  [hua82a],  envelope  and  instantaneous  fre¬ 
quency  are  again  used  as  the  features  in  the  classification  of  Ricker 
wavelets  and  an  important  feature,  polarity,  is  used  to  check  the  phase 
reversal  property  of  gas  reflection.  In  the  analysis  of  zero-phase  Ricker 
wavelet,  the  envelope  is  uniformly  distributed  between  one-third  of  the 
maximum  envelope  and  the  maximum  envelope,  and  the  instantaneous 
frequency  is  almost  constant  within  this  interval.  So  the  use  of  tree 
classification  appears  to  be  a  good  choice.  Besides,  the  tree  classifiers 
in  this  case  are  easy  to  design  and  computationally  efficient,  and  tree 
classification  has  been  used  in  remote  sensing  data  [swa77a,  78a, 
wu75a,  you76a].  Three  hypotheses  are  proposed  in  the  tree 
classification  for  detection  of  candidate  bright  spot.  The  first 
hypothesis  is  that  the  candidate  bright  spot  has  high  amplitude,  low 
frequency  content,  and  phase  reversal.  The  second  hypothesis  is  that 
the  candidate  bright  spot  has  high  amplitude  and  low  frequency  con¬ 
tent.  The  third  hypothesis  is  that  the  candidate  bright  spot  has  high 
amplitude  and  phase  reversal.  Linear  classification  is  also  discussed  in 
the  following. 


3.2  Linear  Classification  of  Ricker  Wavelets 
for  the  Detection  of  Candidate  Bright  Spot 

A  seismic  section  always  contains  several  pattern  classes.  In  order 
to  analyze  the  patterns  in  a  seismic  section,  a  decision-theoretic 
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pattern  recognition  system  is  presented  in  Fig.  3.1.  One-dimensional 
simulated  seismic  trace  and  two-dimensional  simulated  seismogram 
are  investigated  here.  The  purpose  of  the  classification  of  Ricker 
wavelets  is  to  determine  how  to  classify  the  pattern  wavelets  in  a 
seismic  trace.  The  one  dimensional  classification  can  be  directly 
applied  to  two  dimensional  classification  problem.  In  a  two-dimensional 
seismogram,  training  traces  are  randomly  selected  or  selected  from 
the  high  amplitude  traces.  Classifiers  are  derived  from  the  features 
distribution  of  the  training  traces.  So  the  classifiers  are  determined 
from  the  physical  properties  of  training  traces.  From  the  classification 
of  simulated  seismogram,  the  results  are  the  candidate  bright  spots. 

3.2.1  Linear  Classification  of  Ricker  Wavelets 

In  a  simulated  seismic  trace  (Fig. 3.2),  three  classes  of  data  are  gen¬ 
erated.  The  20Hz,  30Hz  Ricker  wavelets  and  Gaussian  white  band  10  Hz 

60  Hz  random  noise  are  generated.  The  three  pattern  classes  can  be 
defined  as 

HI  :  r(t)=n(t) 

H2  :  r(t)=20Hz  Ricker  wavelet  +  n(t) 

H3  :  r(t)=30Hz  Ricker  wavelet  +  n(t) 

Using  analytic  signal  analysis,  envelope  and  instantaneous  fre¬ 
quency  are  extracted  from  the  signal  itself  and  used  as  the  two  features 
in  the  decision-theoretic  pattern  recognition  system.  The  envelope 
value  is  multiplied  by  200.0  such  that  the  magnitudes  of  both  envelope 
and  instantaneous  frequency  are  in  the  same  order  of  magnitude. 
From  the  scattering  diagram  of  the  data  distribution  of  the  two 
features  in  Fig.  3.3,  the  three  pattern  classes  can  be  separated. 
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Figure  3.1  Block  diagram  of  a  decision-theoretical  pattern  recognition 
system 


72 


Figure  3.2  Signal,  its  instantaneous  frequency  and  envelope 
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Envelope  *  200 . 


Figure  3.3  Linear  Classifier 
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Figure  3.4  Linear  classification  result  of  Ricker  wavelets 
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Linear  classification  technique  is  used.  A  modified  fixed-increment 
training  procedure  is  presented  in  the  following  which  is  different  from 
the  fixed  increment  training  procedure  [fuk72a]. 

In  a  two-dimensional  feature  space,  suppose  Y  is  a  training  sample. 
The  training  sample  belongs  to  one  of  the  two  classes,  C1  and  C2.  The 
linear  equation  of  decision  boundary  has  weight  vector  W.  Suppose  that 
the  linear  equation  passes  through  a  given  point  (a, 6).  This  given  point 
(a, 6)  is  given  by  a  subjective  selection  or  the  intersection  point  from 
the  two  equations  of  decision  boundary  which  are  solved  by  using  fixed 
increment  training  procedure  [fuk72a,  fu82a]. 


Seta=l,  Initially  set  W  =  0 

.0. 


If  YEClt  YtW> 0,  W  not  change 
If  YzClt  YtW< 0, 


In  Fig.  3.3,  using  the  fixed-increment  and  the  modified  fixed- 
increment  training  procedures,  the  equations  of  decision  boundary  are 
determined  as  follows. 

Here,  g 2z(x,y)  =  0  is  the  equation  of  decision  boundary  between  the 
feature  distributions  of  20 Hz  and  30IIz  Ricker  wavelets,  giz(x,y)= 0  is 
the  equation  of  decision  boundary  between  the  feature  distributions  of 


75 


30Hz  and  Gaussian  noise,  g iz(x >y)—0  is  the  equation  of  decision  boun¬ 
dary  between  the  feature  distributions  of  20 Hz  and  Gaussian  noise. 
Using  the  fixed-increment  training  procedure,  two  equations  of  decision 
boundary  fl'i3(x,t/)=0,  g  2s(x  ,y)= 0,  can  be  found  and  have  intersection 
point  (a, 6).  Then  the  third  equation  of  decision  boundary  g  12(x  ,y)  =  0  is 
determined  by  using  the  modified  fixed-increment  training  procedure, 
i.e.  the  last  element  of  the  weighting  vector  will  be  changed  by 
w  ~  —  (l)a  +tu(2)6)  The  three  equations  of  decision  boundary 
will  intersect  at  one  point  (a, 6)  (Fig. 3. 3). 

The  other  method  is  that  a  point  (a, 6)  is  given  first  and  the  three 
equations  of  decision  boundary  will  pass  this  point  (a, 6)  after  training 
procedure.  It  is  possible  to  give  this  point  at  two  dimensional  space. 
The  convergence  of  this  modified  fixed-increment  training  procedure  is 
very  fast.  In  this  seismic  trace,  the  linear  classification  result  is  shown 
in  Fig.  3.4.  The  central  parts  of  20 Hz  and  30 Hz  Ricker  wavelets  are 
classified,  so  the  classification  result  in  Fig.  3.4  is  quite  good. 

3.2.2  Linear  Classification  for  the  Detection  of 
Candidate  Bright  Spot 


The  one-trace  pattern  recognition  techniques  can  be  directly 
applied  to  2-D  reflection  seismogram  to  classify  the  candidate  bright 
spot  because  the  processing  is  trace-by-trace.  A  typical  geological 
model  of  gas  accumulation  from  Dobrin  [dob76a]  is  shown  in  Fig.  3.5A. 
The  gas  sand  zone  has  density  D=2.025gm  /  cm3,  velocity 
F=  1.737 km/sec  .  The  oil  sand  zone  has  density  D=2.27gm/  cm3,  velo¬ 
city  F=2.225 km /sec.  Above  the  gas  and  oil  sand  zone  is  the  shale 
layer.  The  primary  reflection  synthetic  seismogram  is  generated  in  Fig. 
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Figure  3.5A  Geological  structure 
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3.5B.  The  20  Hz  and  30  Hz  zero-phase  Ricker  wavelets  are  used  in  Fig. 
3.5B  [ric40a,  ric45a,  ric53a].  There  are  64  traces  in  the  seismogram. 
Every  trace  has  512  points.  The  sampling  interval  is  0.004  seconds. 
Gaussian  10  -  60  Hz  white  noise  is  a  simulation  of  recording  of  earth 
ground  roll  motion.  Of  course  there  are  some  sophisticated  modeling 
techniques  to  generate  more  complex  seismograms,  but  the  primary 
reflection  seismogram  is  the  most  efficient  and  the  easiest  one 

corresponding  to  the  geological  structures. 

%  . 

The  thickness  of  the  gas  sand  zone  is  large  because  the  time  to 
pass  this  gas  sand  zone  is  longer  than  one  cycle  of  20 Hz  Ricker  wavelet. 
The  thickness  of  the  gas  sand  zone  considered  in  the  simulated  seismo¬ 
gram  is  related  to  the  resolution  problem  for  the  20 Hz  Ricker  wavelet 
passing  through  the  gas  sand  zone  without  interference  with  the 
reflection  from  the  bottom  layer. 

Two  pattern  classes  are  considered  in  the  simulated  seismogram. 
One  is  the  20 Hz  Ricker  wavelet  at  the  boundary  of  shale  and  gas  sand 
zones  (bright  spot).  The  20 Hz  Ricker  wavelet  has  high  amplitude  pro¬ 
portional  to  the  high  reflection  coefficient,  phase  reversal  and  low  fre¬ 
quency  content  due  to  high  frequency  attenuation  in  the  gas  sand  zone. 
The  other  is  the  non-bright  spot,  the  30 Hz  Ricker  wavelet  at  the  other 
layer  boundary  reflection  and  Gaussian  white  band  10  Hz  -  60  Hz  ran¬ 
dom  noise  corresponding  to  the  random  noise  collected  in  the  field. 
The  envelope  (multiplied  by  200.0)  and  instantaneous  frequency  are 
selected  as  two  features. 

Two  methods  for  the  automatical  selection  of  training  traces  are 
presented  in  the  following. 

(I)  The  training  traces  are  equally  selected  from  the  seismogram  to 
cover  the  reflection  characteristics  of  every  kind  of  layers.  Suppose 
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Figure  3.5B  Synthetic  seismogram  of  bright  spot 
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that  the  number  of  traces  in  the  seismogram  is  N,  then  the  number  of 
training  traces  is 

8 

(II)  The  training  traces  are  selected  from  high  amplitude  traces. 

Algorithm  3.1:  Selection  of  training  traces  from  the  high  amplitude  por¬ 
tion 


Input,  a(i,j)  values  of  seismic  data,  N  traces  of  seismogram  with  m 
points  per  trace. 

Output:  Location  of  training  traces 
Method: 

(1)  Let  the  number  of  training  traces  equal  to  N/8. 

(2)  Calculate  the  power  distribution  p(j)  along  every  trace.  The 
power  of  each  trace  is  calculated  by 

1  TTL 

—Tla2(iJ)=P(j)  j- 1,2,  •  •  •  ,N 

771  i  =  l 

(3)  Use  K-means  algorithm  for  K=2  to  find  the  cluster  centers  of 

two  classes,  m1(  and  m2>  and  let  the  threshold  =  (mi+m 2) 

2 

(4)  Thresholding  p(j )  to  segment  the  whole  trace  into  several 
zones.  Determine  the  first  and  last  points  of  each  zone  and  calculate 
the  length  of  each  zone. 

(5)  Set  weight  2  for  the  power  value  above  the  threshold  and  weight 
1  for  the  value  below  and  equal  to  the  threshold.  Distribute  the  N/8 
training  traces  to  these  zones  equally. 
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Discussion  of  K-means  clustering  analysis  can  be  found  in  [lin80a, 
swa78a,  tou74a],  Two  classes  are  in  the  analysis,  they  are  high  ampli¬ 
tude  traces  and  low  amplitude  traces.  In  Step  (3),  using  K-means  algo¬ 
rithm  for  K=2  can  separate  the  two  classes. 

Using  Algorithm  3.1  in  simulated  seismogram  of  Fig.  3.5B,  the 
power  values  along  time  axis  is  shown  in  Fig.  3.5C,  the  threshold  is 
0.0017.  Three  zones  are  segmented.  From  the  ls£  to  19th  traces  are 
the  first  zone,  from  the  20 th  to  43th  traces  are  the  high  amplitude 
zone,  from  the  46th  to  64 th  traces  are  the  third  zone.  The  location  of 
the  training  traces  are  the  lOfA,  23rd,  28th,  32nd,  36th.,  40th ,  55th 
and  shown  in  Fig.  3.5C.  Training  traces  are  concentrated  on  the  high 
amplitude  traces.  At  the  first  zone  and  third  zone,  only  one  testing 
trace  is  at  each  zone  because  there  is  round  off  error.  The  total 
number  of  training  traces  selected  is  7  in  this  experiment.  The  histo¬ 
gram  of  the  power  data  is  shown  in  Fig.  3.5D.  From  visual  inspection, 
the  threshold  is  0.0017  which  is  the  same  as  the  output  of  Algorithm 
3.1. 

In  the  experimental  study,  method  (I)  is  used.  Training  traces  are 
equally  selected  from  the  seismogram  for  easy  selection  and  conveni¬ 
ence.  The  training  traces  are  the  4th,  12th, 20th, 28th, 36th, 44th,  52nd, 
and  60  th  traces.  Training  samples  are  selected  for  the  values  of 
envelope  larger  than  0.15  in  Fig.  3.6A.  A  linear  classifier  is  used.  The 
modified  fixed-increment  training  procedure  is  applied.  In  the  scatter¬ 
ing  diagram  of  two  features,  a  point  (a, 6)  is  given  first,  the 
classification  boundary  will  pass  through  this  point  (a, 6)  after  the  solu¬ 
tion  weight  vector  is  found.  Classification  result  using  linear  classifier 
is  shown  in  Fig.  3.6B.  Compared  with  original  seismogram  in  Fig.  3.5B,  it 
appears  to  be  quite  good  in  the  detection  of  candidate  bright  spot. 
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Figure  3.5D  Histogram  of  Figure  3.5C 
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Figure  3.6A  Feature  distribution  of  training  samples 
and  linear  classifier 
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Figure  3.6B  Linear  classification  result  of  bright  spot 
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3.3  Thin-Bed  Effect 

In  Fig.  3.9A,  the  time  to  pass  through  the  gas  sand  zone  is  less  than 
the  half  cycle  of  20 Hz  Ricker  wavelet,  the  gas  sand  zone  is  called  "thin- 
bed  .  Using  the  results  from  Section  2.13,  the  experiments  of  thin  bed 
for  the  detection  of  candidate  bright  spot  are  discussed.  Two  simulated 
seismograms  of  the  second  hypothesis  for  thin  bed  of  gas  sand  zone  are 
generated,  i.e.,  high  amplitude  and  low  frequency  at  the  gas  reflection. 
The  simulated  seismogram  for  the  quarter  wavelength  thickness  of  gas 
sand  zone  is  shown  in  Fig.  3.7.  The  simulated  seismogram  for  the  thick¬ 
ness  of  the  2/3  quarter  wavelength  in  gas  sand  zone  is  shown  in  Fig. 
3.9B.  From  both  Fig.  3.7  and  Fig.  3.9B,  although  the  20  Hz  Ricker 
wavelet  at  the  top  of  the  gas  sand  zone  is  mixed  with  the  30  Hz  Ricker 
wavelet  at  the  bottom  of  the  gas  sand  zone,  the  20  Hz  Ricker  wavelet  is 
dominant  and  can  overcome  the  interference,  i.e.,  the  physical  proper¬ 
ties  are  preserved.  Compared  with  the  original  simulated  seismograms 
in  Fig.  3.7  and  3.9B,  the  classification  results  shown  in  Fig.  3.8  and  Fig. 
3.10  respectively  seem  to  be  still  good. 


3*4  Tree  Classification  of  Ricker  Wavelets  for  the 
Detection  of  Candidate  Bright  Spot 


3.4.1  Introduction 

In  the  decision-theoretic  approach,  another  way  of  detecting  bright 
spot  is  using  tree  classification.  Three  kinds  to  detect  the  hypotheses 
of  candidate  bright  spot  in  the  simulated  seismic  traces  and 
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Figure  3.7  Synthetic  seismogram  of  bright  spot  (Thickness  of  gas  sand 
zone  is  quarter  wavelength) 
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Figure  3.9A  Geological  structure 
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Figure  3.9B  Synthetic  seismogram  of  bright  spot  (Thickness 
of  gas  sand  zone  is  2/3  quarter  wavelength) 
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Figure  3.10  Linear  classification  result  of  bright  spots  (  2/3  quarter 
wavelength  at  gas  sand  zone  ) 
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seismograms  are  presented  in  the  following. 

(1)  The  first  hypothesis  is  that  the  bright  spot  has  high  amplitude, 
low  frequency  content  and  polarity  reversal.  Envelope,  instantaneous 
frequency  and  polarity  are  used  in  the  tree  classifier. 

(2)  The  second  hypothesis  is  that  the  bright  spot  has  high  ampli¬ 
tude  and  low  frequency  content.  Envelope  and  instantaneous  frequency 
are  used  in  the  tree  classifier. 

(3)  The  third  hypothesis  is  that  the  bright  spot  has  high  amplitude 

and  polarity  reversal.  Envelope  and  polarity  are  used  in  the  tree 
classifier. 

The  first  hypothesis  of  candidate  bright  spot  is  the  most  restrictive 

one. 

In  the  simulation,  zero-phase  Ricker  wavelets  are  used.  The  same 
procedures  can  be  used  for  minimum-phase  wavelets.  The  purpose  of 
one-dimensional  classification  of  Ricker  wavelets  is  to  determine  how  to 
classify  the  pattern  wavelets  in  a  seismic  trace.  The  block  diagram  of  a 
tree  classification  system  is  shown  in  Fig.  3.11.  From  Fig.  3.11,  for  the 
second  hypothesis,  the  polarity  classification  is  not  used.  For  the  third 
hypothesis,  the  instantaneous  frequency  is  not  used. 

For  an  input  seismogram,  the  data  are  initially  not  known  as  candi¬ 
date  bright  spot  or  non-candidate  bright  spot.  Some  testing  traces  are 
equally  selected  from  the  seismograms,  the 

4th  ,20th  ,2Qth  ,36th  ,44th ,  52nd,  and  60 th  traces;  three  hypotheses 

and  an  unsupervised  clustering  analysis  are  used  to  detect  whether  or 
not  the  testing  traces  have  candidate  bright  spot.  The  candidate  bright 
spot  is  detected  if  one  of  the  three  hypotheses  is  satisfied.  Thresholds 
arc  determined  by  inspection  for  a  good  separability  in  the  scattering 
diagram  of  envelope  and  instantaneous  frequency.  After  designing  the 
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Figure  3.11  Block  diagram  of  a  tree  classification  system 
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tree  classifier  in  terms  of  envelope  and  instantaneous  frequency  from 
the  testing  traces,  a  classification  experiment  is  performed.  Polarity 
classification  is  performed  after  the  classification  in  terms  of  envelope 
and  instantaneous  frequency.  In  real  data  experiment,  the  seismo¬ 
grams  at  Mississippi  Canyon  and  High  Island  are  the  relative-amplitude 
recording  with  bandpass  10  -  50  Hz. 


3.4.2  Definition  of  Polarity 

Envelope  and  instantaneous  frequency  are  defined  in  Chapter  2. 
Polarity  of  zero-phase  Ricker  wavelets  and  minimum-phase  wavelets 
are  defined  as  follows: 

(I)  Suppose  that  a  truncated  symmetric  zero-phase  Ricker  wavelet  has 
the  duration  of  N  points,  ao,^,  ~,  aN_v  Select  N0<N,  where  N0  points 
pass  through  the  center  (extremum)  of  the  Ricker  wavelet. 

i=j+N  0-i 

If  2  <  0,  then  the  polarity  is  positive. 

i=j 

i=j+N  o-l 

If  2  >  0,  then  the  polarity  is  negative. 

i=j 

(II)  Suppose  that  a  truncated  minimum-phase  wavelet  [rob67a]  has  the 
duration  of  Appoints,  a.0,alt  aN-1.  Select  NQ<N,  where  N0  points 
pass  through  the  extremum  of  the  minimum-phase  wavelet. 

i=j  +  N0—\ 

If  2  Q-i  >  0,  then  the  polarity  is  positive. 
i=j 
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i=j+N0- 1 

If  2  <  0.  then  the  polarity  is  negative. 

i-3 


N o  and  j  are  determined  from  the  tree  classification  result  using 
envelope  and  instantaneous  frequency  or  from  the  tree  classification 
result  using  envelope  only. 

3.4.3  Design  of  Tree  Classifier 

From  the  scattering  diagram  of  the  experimental  results  shown  in 
Fig.  3.3  and  Fig.  3.12,  the  envelope  of  a  zero-phase  Ricker  wavelet  is  uni¬ 
formly  distributed  between  one-third  of  the  maximum  envelope  and  the 
maximum  envelope,  and  the  instantaneous  frequency  is  approximately 
constant  within  this  interval.  Inside  this  interval,  the  data  are 
corresponding  to  the  central  part  of  the  Ricker  wavelet.  Without  using 
these  two  simple  properties  in  the  tree  classifier  design,  the  detection 
of  candidate  bright  spot  will  be  a  complex  process.  Analytic  signal 
analysis  in  Gaussian  noise  has  been  discussed  in  Chapter  2.  In  Fig.  2.5, 
the  envelope  of  Gaussian  noise  is  a  Rayleigh  distribution.  In  Fig.  3.3  and 
3.12,  the  distribution  of  envelope  for  Ricker  wavelet  is  unformly  distri¬ 
buted.  So  in  the  classification  of  Ricker  wavelets  plus  Gaussian 
bandpass  noise  N(0,az),  the  threshold  of  envelope  is  determined 
between  Rayleigh  (Gaussian  noise)  and  uniform  (Ricker  wavelets)  distri¬ 
butions.  Usually  the  Bayes  decision  rule  is  used.  For  high  signal  to 
noise  ratio,  the  threshold  of  envelope  is  determined  by  selecting  the 

maximum  between  maximum  envelope  of  Ricker  wavelet  and  3cr  of 

Gaussian  noise.  For  the  data  in  the  scattering  diagram  above  the 
envelope  threshold,  the  threshold  of  instantaneous  frequency  is 
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Figure  3.12A  30  and  20  Hz  Ricker  wavelets  and  its  analytic  signal 
representations 
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Figure  3.12B  Feature  distribution  of  a  seismic  trace 
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determined  by  inspection  if  data  are  separable,  or  by  a  K-means  clus¬ 
tering  analysis  and  maximum  PFS  (pseudo  F-statistics)  value  [vog78a]. 

To  design  a  classifier  in  terms  of  instantaneous  frequency  only,  the 
unsupervised  clustering  analysis  is  reduced  to  one-dimensional 
analysis.  K-means  clustering  algorithm  is  used  for  the  clustering 
analysis.  In  the  first  step,  setting  k=  2,  using  K-means  clustering  algo¬ 
rithm,  the  centers  of  two  classes  are  found.  If  |m1-m2|  is  less  than  3.0 
Hz  of  instantaneous  frequency,  then  the  number  of  class  is  one,  other¬ 
wise  the  number  of  class  is  two  or  more.  In  the  second  step,  if  the 
number  of  class  is  two  or  more,  then  for  different  k  (k> 2),  the  PFS 
value  [vog78a]  is  calculated  to  determine  the  optimum  clustering 
number. 

PFS=  trSb(n-k) 
trSw(k  —1) 

where  Sb  is  the  between-cluster  scatter  matrix  and  Sw  is  the  within- 
cluster  scatter  matrix  [vog78a,  fuk74a,  liu82a]. 

The  maximum  PFS  value  corresponds  to  the  optimal  cluster 
number.  After  the  PFS  analysis,  the  classification  threshold  on  instan¬ 
taneous  frequency  is  determined  by  771  2  _  where  mj  and  are 

the  centers  of  the  leftmost  two  classes. 

In  the  scattering  diagram  of  envelope  and  instantaneous  frequency, 
for  the  case  of  one  class  above  envelope  threshold,  the  bright  spot  is 
detected  by  the  third  hypothesis.  For  the  case  of  two  or  more  classes, 
the  bright  spot  is  detected  by  the  first  or  second  hypothesis. 

Three  examples  are  presented.  The  first  example  is  the  simulation 
of  20  and  30  Hz  Ricker  wavelets  in  a  seismic  trace  in  Fig.  3.2  or  3.17. 
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The  second  example  is  the  simulation  of  20  and  30  Hz  Ricker  wavelets 
in  the  seismogram  in  Fig.  3.5  or  3.18.  The  third  example  is  the  real 
seismogram  analysis  at  Mississippi  Canyon  in  Fig.  3.22.  Table  3.1  shows 
the  clustering  analysis  of  these  three  examples.  For  good  separability 
of  features,  the  threshold  is  determined  by  visual  inspection. 

For  detection  of  candidate  bright  spot  in  two-dimensional  seismo¬ 
grams,  the  testing  traces  are  equally  selected  from  a  seismogram  to 
cover  the  reflection  characteristics  of  every  kind  of  layers.  The  testing 
traces  are  the  4th,  12th, 20th, 28th. ,36th, 44th,  52nd,  and  60th  traces. 
Eight  testing  traces  are  selected  from  a  64-trace  seismogram.  One 
testing  trace  is  selected  from  every  eight  traces.  This  kind  of  selection 
of  testing  traces  is  easy  and  convenient.  Three  hypotheses  and  an 
unsupervised  clustering  analysis  are  used  to  detect  whether  or  not  the 
testing  traces  have  candidate  bright  spot.  The  tree  classifier  design  is 
the  same  as  that  for  one-dimensional  case.  After  determining  the  tree 
classifiers  from  the  testing  traces,  the  whole  seismogram  will  be  pro¬ 
cessed. 


3.4.4  The  First  Hypothesis  and  Its  Corresponding  Tree  Classifier 

There  are  four  classes  in  the  simulated  seismic  trace  (Fig.  3.14A) 
and  seismogram  (Fig.  3.15A): 

Cl:  r(t)=n(t)+20  Hz  Ricker  wavelet  at  top  of  gas  sand  zone  with  polarity 
reversal. 

C2:  r(t)=n(t)+20  Hz  Ricker  wavelet  at  top  of  oil  sand  zone. 

C3:  r(t)=n(t)+30  Hz  Ricker  wavelet  from  other  boundaries. 

C4:  r(t)=n(t)  Gaussian  10-60  Hz  white  noise. 

where  Cl  and  C2  are  related  to  the  high  amplitude,  low  frequency 
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Table  3. 1  Clustering  Results 

(1)  In  Fig. 3. 2.  20  Hz  and  30  Hz  Ricker  wavelets  in  a  simulated  seismic 
trace 


trSb 

trSw 

PFS 

C\2 

1) 

4i 

179.44 

3.28 

710.67 

«-  optimal 

k  -  3 

0.75 

1.975 

549.15 

k  =4 

0.75 

1.975 

335.59 

I.F.  threshold  =  23.66 


(2)  In  Fig. 3.5.  20  Hz  and  30  Hz  Ricker  wavelets  in  synthetic  seismogram 


trSb 

trSw 

PFS 

k  =  2 

879.44 

77.75 

542.93 

«-  optimal 

k  =  3 

914.33 

42.86 

501.35 

k  =  4 

922.70 

34.49 

410.24 

I.F.  threshold  =  24.73 


(3)  In  Fig. 3. 22.  Real  seismogram  at  Mississippi  Canyon 


trSb 

trSw 

PFS 

C\2 

II 

4? 

228.80 

108.23 

71.87 

no  significant 

k  =  3 

275.27 

61.76 

73.54 

optimal 

II 

42 

300.82 

36.21 

88.63 

I.F.  threshold  =  26.69 


Select  k=2,  one  is  bright  spot,  the  other  is  none. 
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Figure  3. 14B  Result  of  tree  classification  (The  first  hypothesis) 
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Figure  3. 15A  The  first  hypothesis  synthetic  seismogram  of  bright  spots 


Figure  3. 15B  Feature  distribution  of  testing  traces 


Figure  3. 15C  Tree  classification  result  of  bright  spots 
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content  and  polarity  reversal  in  gas  and  oil  sand  zones.  Envelope, 
instantaneous  frequency  and  polarity  are  selected  as  the  features  in 
the  tree  classification.  The  binary  tree  classifier  designed  is  shown  in 
Fig.  3.13.  Both  classification  results  of  Ricker  wavelets  and  the  detec¬ 
tion  of  candidate  bright  spot  are  shown  in  Fig.  3.14B  and  3.15C.  Com¬ 
pared  with  the  original  simulated  seismic  trace  in  Fig.  3.14A  and 
seismogram  in  Fig.  3.15A,  the  results  are  quite  good.  In  Fig.  3.14B,  four 
classes  are  classified  by  using  envelope,  instantaneous  frequency,  and 

polarity.  Using  polarity  as  a  feature  can  detect  the  reflections  in  gas 
and  oil. 


3.4.5  The  Second  Hypothesis  and  Its  Corresponding  Tree  Classifier 

There  are  three  classes  in  the  simulated  seismic  trace  (Fig.  3.17A) 
and  seismogram  (Fig.  3.18A)  : 

Cl:  r(t)=n(t)+20  Hz  Ricker  wavelet  at  top  of  gas  sand  zone  with  polarity 
reversal. 

C2:  r(t)=n(t)+30  Hz  Ricker  wavelet  from  other  boundaries. 

C3:  r(t)=n(t)  Gaussian  10-60  Hz  white  noise. 

where  Cl  is  related  to  the  high  amplitude,  low  frequency  content  and 
polarity  reversal  in  gas  sand  zone.  Envelope  and  instantaneous  fre¬ 
quency  are  selected  as  the  features  in  the  tree  classification.  The 
binary  tree  classifier  designed  is  shown  in  Fig.  3.16.  Classification 
results  of  Ricker  wavelets  and  the  detection  of  candidate  bright  spot 
are  shown  in  Fig.  3.17B  and  3.18C  respectively.  Compared  with  the  ori¬ 
ginal  simulated  seismic  trace  in  Fig.  3.17A  and  seismogram  in  Fig. 
3.18A,  the  results  are  quite  good.  In  Fig.  3.17B,  three  classes  are 
classified  by  using  envelope  and  instantaneous  frequency.  Envelope  and 
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Figure  3.16  Tree  classifier  of  the  second  hypothesis 
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Figure  3.17A  Signal,  its  instantaneous  frequency  &  envelope 
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Figure  3.17B  Result  of  tree  classification  (The  second  hypothesis) 
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Figure  3.18A  The  second  hypothesis  synthetic  seismogram  of  bright  spots 


Figure  3.18B  Feature  distribution  of  testing  traces 
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Figure  3.18C  Tree  classification  result  of  bright  spot 
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polarity  can  also  be  selected  as  the  features  for  this  kind  of  hypothesis 


3.4.6  The  Third  Hypothesis  and  Its  Corresponding  Tree  Classifier 

There  are  three  classes  in  the  simulated  seismic  trace  (Fig.  3.20A) 
and  seismogram  (Fig.  3.21A): 

Cl.  r(t)-n(t)+25  Hz  Ricker  wavelet  at  top  of  gas  sand  zone  with  polarity 
reversal. 

C2.  r(t)-n(t)+25  Hz  Ricker  wavelet  from  other  boundaries. 

C3:  r(t)=n(t)  Gaussian  10-60  Hz  white  noise. 

where  Cl  is  related  to  the  high  amplitude  and  polarity  reversal,  but  no 
low  frequency  content  in  the  gas  sand  zone.  Envelope  and  polarity  are 
selected  as  the  features  in  the  tree  classification.  The  tree  classifier 
designed  is  shown  in  Fig.  3.19.  Classification  results  of  Ricker  wavelets 
and  the  detection  of  candidate  bright  spot  are  shown  in  Fig.  3.20B  and 
3-21C  respectively.  Compared  with  the  original  simulated  seismic  trace 
in  Fig.  3.20A  and  seismogram  in  Fig.  3.2 1A,  the  results  are  quite  good. 
In  Fig,  3.20B,  three  classes  are  classified  by  using  envelope  and  polarity. 
Using  polarity  as  the  feature  can  detect  the  polarity  reversal  in  the 
Ricker  wavelet  of  a  seismic  trace  and  the  candidate  bright  spot  of  the 
seismogram. 


3.4.7  Real-Data  Experiment 

Two  relative  amplitude  seismograms  are  processed. 

(I)  Data  from  Mississippi  Canyon 

In  Fig.  3.22,  at  0.2  and  1.4  seconds,  the  dominant  wavelets  are  the 
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Figure  3.19  Tree  classifier  of  the  third  hypothesis 


Figure  3.20A  Signal,  its  instantaneous  frequency  and  envelope 


Output  value 


n 

-.«oo  - 

o.ooo  .o»*  .ton  .«•*  .mr  .ten  .1000 


Figure  3.20B  Result  of  tree  classification  (The  third  hypothesis) 
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Figure  3.21A  The  third  hypothesis  synthetic  seismogram  of  bright  spots 
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Figure  3.21B  Feature  distribution  of  testing  traces 


109 


Figure  3.21C  Tree  classification  result  of  bright  spots 
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minimum-phase  wavelets.  The  tree  classifier  designed  is  the  same  as 
that  in  Fig.  3.13  for  the  first  hypothesis.  The  classification  result  is 
shown  in  Fig.  3.22C.  Compared  with  the  real  data  in  Fig.  3.22,  the  result 
is  easy  to  be  interpretated.  The  candidate  bright  spot  is  at  1.4  seconds 
which  is  the  most  probable  to  accumulate  gas.  The  left-hand  side  of  1.4 
seconds  with  positive  polarity  comes  from  the  interference  effect  of 
thin  bed.  The  high  frequency  contents  at  1.3  seconds  and  the  neighbors 
of  the  20th.  trace  have  been  eliminated  using  the  instantaneous  fre¬ 
quency  threshold. 

(II)  Data  from  High  Island 

In  Fig.  3.23,  at  1.7  seconds,  the  dominant  wavelets  are  the  zero- 
phase  wavelets.  From  the  scattering  diagram  of  envelope  and  instan¬ 
taneous  frequency  in  the  testing  traces,  the  low  frequency  content  is 
found  to  be  not  significant.  The  tree  classifier  designed  is  the  same  as 
that  in  Fig.  3.19  for  the  third  hypothesis.  Envelope  and  polarity  are 
selected  as  the  features  in  the  tree  classification.  The  classification 
result  is  shown  in  Fig.  3.23C.  The  candidate  bright  spot  is  at  1.7 
seconds  which  is  the  most  probable  to  accumulate  gas.  At  the  middle 
part  of  the  layer  of  1.7  seconds,  the  reflection  of  the  positive  polarity  is 
the  most  probable  to  accumulate  oil/water.  Without  using  the 
classification  technique,  the  candidate  bright  spot  may  be  interpre¬ 
tated  at  the  middle  part  of  the  layer  of  1.7  seconds  by  using  visual 
inspection. 


3.5  Conclusions 


Ill 


Figure  3.22A  Relative  amplitude  seismogram  at  Mississippi  Canyon  (Nega¬ 
tive  on  the  Right) 


Figure  3.22B  Feature  distribution  of  testing  traces 
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Figure  3.22C  Tree  classification  result  of  bright  spots 
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Figure  3.23A  Relative  amplitude  seismogram  at  High  Island  (Negative  on 
the  right) 


Figure  3.23B  Feature  distribution  of  testing  traces 
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Figure  3.23C  Tree  classification  result  of  bright  spots 
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(1)  The  advantages  of  the  modified  fixed-increment  training  pro¬ 
cedure  are  that  the  convergence  is  very  fast  and  the  decision  boun¬ 
daries  are  intersected  at  one  point. 

(2)  From  the  analysis  of  zero-phase  Ricker  wavelets,  the  use  of  tree 
classification  is  proposed.  The  advantages  of  the  tree  classification  are 
easy  to  design  and  computationally  efficient.  As  a  matter  of  fact,  the 
design  of  the  tree  classifier  is  easier  than  that  of  the  linear  classifier. 

(3)  The  envelope  threshold  is  determined  from  the  uniform  distri¬ 
bution  of  wavelets  and  Rayleigh  distribution  of  Gaussian  noise.  The 
threshold  of  the  instantaneous  frequency  is  determined  by  visual 
inspection  if  data  are  visually  separable  or  by  an  unsupervised  cluster¬ 
ing  analysis. 

(4)  For  high  signal  to  noise  ratio,  using  envelope  is  easy  to  separate 
signal  plus  noise  (Ricker  wavelets  plus  noise,  seismic  data)  and  noise 
alone.  So  envelope  is  selected  as  the  feature  for  the  first-level  tree 
classification.  Instantaneous  frequency  is  selected  as  the  feature  for 
the  second-level  tree  classification.  Polarity  is  used  after  the 
classifications  in  terms  of  envelope  and  instantaneous  frequency. 

(5)  In  real  data,  the  wavelet  is  not  exactly  the  minimum-phase  or 
zero-phase  wavelet.  Since  a  major  part  of  the  wavelet  is  close  to  the 
central  part  of  Ricker  wavelet,  so  a  tree  classifier  can  be  used. 

(6)  Envelope,  instantaneous  frequency  and  polarity  have  been  used 
as  the  features  for  tree  classification  in  the  simulated  and  real 
relative-amplitude  seismograms.  The  detected  candidate  bright  spot 
satisfies  one  of  the  three  hypotheses.  The  classification  results  are 
quite  - good  and  can  be  used  to  improve  seismic  interpretation.  The 
detected  candidate  bright  spot  (gas  reflection)  is  clearly  shown. 
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(7)  For  the  linear  classification  of  simulated  seismogram  in  this 
Chapter,  training  samples  have  good  separability  and  linear  classifier 
can  be  determined.  But  tree  classifiers  of  envelope  and  instantaneous 
frequency  can  even  be  designed  by  visual  inspection.  In  the  real  data, 
the  testing  samples  are  not  known  as  bright  spot  or  non-bright  spot. 
Linear  classifier  can  not  be  used.  For  the  real  data,  the  pattern  of 
reflection  wavelet  approximately  to  the  central  part  of  zero-phase 
Ricker  wavelet.  From  the  analysis  of  Ricker  wavelet,  tree  classification 
is  proposed.  Envelope  can  be  used  to  classify  high  amplitude  and  low 
amplitude  portion.  Instantaneous  frequency  can  be  used  to  classify  low 

frequency  and  high  frequency  contents.  So  a  tree  classifier  is  easy  to 
design  and  efficient. 
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CHAPTER  IV 

DETECTION  OF  CANDIDATE  BRIGHT  SPOT  IN  FREQUENCY  ATTENUATED 
SEISMOGRAM  USING  A  PARTITIONING-METHOD  AND  TREE  CLASSIFICATION 


4.1  Introduction 

There  are  many  attenuation  effects  in  a  seismogram,  quality  factor 
Q  being  the  dominant  one  [car81a,  fut62a,  gar64a,  gar74a,  ham72a, 
johSla,  kja79a,  obr61a  ,  sch78a,  str70a,  tro62a,  wyl62a].  The  most  com¬ 
monly  used  measures  of  attenuation  found  in  the  literature  are  the 
attenuation  coefficient  a  which  is  the  exponential  decay  constant  of  the 
amplitude  of  a  plane  wave  traveling  in  a  homogeneous  medium;  the 
quality  factor  Q  and  its  inverse  sometimes  called  internal  friction 

or  dissipation  factor  [tok81a],  The  quantities  are  related  as  follows: 

1  _  av 

Q  nf 

where  v  is  the  velocity  and  /  is  the  frequency. 

Frequency  attenuation  effect  may  appear  in  a  seismogram.  A 
bright  spot  has  important  indicators  of  high  amplitude,  low  frequency 
content,  and  phase  reversal.  But  reflections  from  other  layers  in  a 
seismogram  probably  also  contain  some  of  these  indicators,  especially 
the  low  frequency  content.  Consequently,  the  decision-theoretic  pat¬ 
tern  recognition  techniques  proposed  in  Chapter  3  to  classify  the  whole 
seismogram  often  may  not  be  effective  enough.  A  “partitioning- 
method  is  proposed  in  this  Chapter.  The  basic  idea  is  that  bright  spots 


118 


having  low  frequency  content  is  only  absolutely  true  in  a  small  section 
of  a  seismogram.  Block  diagram  of  the  partitioning-method  and  tree 
classification  system  is  shown  in  Fig.  4.1.  A  seismogram  is  first  parti¬ 
tioned  into  small  sections.  For  each  section,  three  hypotheses  and 
their  corresponding  tree  classification  techniques  (described  in 
Chapter  3)  are  used  to  detect  the  candidate  bright  spot.  The  major 
advantage  of  the  partitioning-method  is  that  the  overlapping  distribu¬ 
tions  of  the  envelope  and  instantaneous  frequency  in  different  sections 
can  be  separated,  i.e.,  the  distribution  of  envelope  and  instantaneous 
frequency  from  one  seismic  section  is  not  disturbed  by  those  of  other 
seismic  sections.  The  design  of  the  tree  classifier  in  one  seismic  sec¬ 
tion  is  therefore  easier  than  that  for  the  whole  seismogram.  The  com¬ 
puter  time  for  processing  a  small  section  is  also  shorter  than  that  for 
processing  the  whole  seismogram.  The  experiment  on  a  real  seismo¬ 
gram  in  Mississippi  Canyon  is  presented. 


4-2  Procedure  of  Partitioning-Method  and  Tree  Classification 

From  the  block  diagram  of  the  partitioning-method  and  tree 
classification  system  in  Fig.  4.1,  five  steps  are  described  as  follows. 

(1)  Envelope  and  instantaneous  frequency  are  extracted  as  the 
features  from  the  input  seismogram. 

(2)  The  seismogram  is  partitioned  into  small  sections  by  using  a 
boundary  determination  procedure,  of  which  the  detailed  steps  are  dis¬ 
cussed  in  Section  4.3. 

(3)  Testing  traces  are  selected  according  to  the  method  (I)  or  (II) 
described  in  Section  3.2.2.  Testing  traces  are  equally  selected  or 
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selected  at  high  amplitude  traces  from  each  section.  The  three 
hypotheses  and  tree  classification  techniques  in  Chapter  3  are  used  to 
detect  which  sections  contain  candidate  bright  spot. 

(4)  Combine  the  classification  results  from  each  section  and  output 
the  complete  classification  results. 


4.3  Boundary  Determination  in  Seismogram  Partition 

In  Huang  and  Fu  [hua83c],  boundaries  are  determined  by  visual 
inspection  to  partition  a  seismogram  into  small  sections  to  avoid  fre¬ 
quency  and/or  amplitude  attenuation  effect  in  the  detection  of  candi¬ 
date  bright  spot.  The  rule  of  boundary  determination  is  that  each  sec¬ 
tion  covers  some  entity  reflections,  i.e.,  two  or  three  more  reflection 
layers  close  together.  The  boundary  may  pass  through  the  reflection 
wavelets,  but  such  a  boundary  should  be  avoided.  The  problem  is  that 
boundary  determination  is  not  unique  and  is  only  subjective  by  visual 
inspection.  Now,  a  new  method  is  proposed  for  the  determination  of 
boundaries  and  a  more  objective  analysis  may  be  entailed.  The  boun¬ 
dary  will  pass  the  middle  point  between  two  signals.  The  real  seismo¬ 
gram  of  Mississippi  Canyon  is  used  in  the  analysis.  There  are  64  traces 
and  512  points  for  every  trace. 

Steps  of  processing  are  given  as  follows. 

Step  1:  Find  the  power  distribution  along  time  axis. 

Suppose  that  =  •  •  •  ,512j  =1,2,  •  •  •  ,64,  are  the  data 

values  in  the  seismogram,  where  i  is  the  index  of  time,  and  j  is  the 
index  of  seismic  trace.  For  every  time  i,  summation  of  the  64 
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horizontal  squared  data  values  represents  the  energy  of  reflection  at 
time  i.  (The  summation  looks  like  a  projection).  Then  the  summation 
divided  by  64  represents  the  power  at  time  i. 

— 2  a2(ili7-)=s(i),  "fc  =  1,2,  •  •  •  ,512 

From  s (i).'i  =  1,2, ...,5 12,  we  can  see  the  reflection  distribution  with 
respect  to  time  i.  This  also  has  the  meaning  of  probability  of  reflection 
at  time  i  or  the  depth  of  the  earth. 

The  high  amplitude  wavelets  are  called  "signal  plus  noise."  The  data 
with  low  values  are  called  'noise."  At  time  i,  if  only  noise  is  present  for 
J  >7  —  1  >2, . . .  ,64,  and  assumed  Gaussian  zero-mean,  then  s(i)  represents 
the  variance  of  the  noise.  At  horizontal  and  vertical  directions,  Gaus¬ 
sian  noise  is  assumed.  If  signal  plus  noise  is  present  at  high  amplitude 
wavelet,  then  s(i)  approximately  represents  the  power  of  the  signal. 

For  the  reflection  of  horizontal  layer,  the  distribution  of  s(i)  is 
more  concentrated  and  has  a  sharp  peak.  For  non-horizontal 
reflection,  i.e.,  slope  (dip)  reflection  or  complex  reflection  (example  of 
simulated  seimogram  in  Fig.  3.5B),  the  distribution  of  s(i)  is  widely 
spread. 

Step  2:  Threshold  to  separate  signal  plus  noise  and  noise  only 

Two  methods  are  considered  in  the  following: 

(1)  Set  a  threshold  which  is  equal  to  5  times  of  variance  of  the 
noise,  then  classify  signal  plus  noise  and  noise. 

In  order  to  calculate  the  distance  between  two  successive  signals,  a 
threshold  must  be  determined  to  separate  the  signal  plus  noise  and 
noise.  The  method  presented  in  this  Chapter  is  to  calculate  the 


122 


variance  a  of  the  noise  in  the  ls£  trace  of  the  seismogram  at  Missis¬ 
sippi  Canyon.  The  threshold  is  set  to  5a2.  The  square  root  of  threshold 
value  equals  to  V5a.  The  probability  of  noise  value  that  is  beyond  ±V5a 
is  small.  Above  the  threshold  the  signal  is  present. 

(2)  If  we  do  not  know  the  noise,  then  measure  the  maximum  of  the 
power  value  at  each  wavelet,  find  the  average  of  those  maximum  values 
and  set  a  threshold  which  is  equal  to  the  half  of  this  average. 

Step  3:  Find  the  distance  between  two  successive  signals. 

If  the  first  point  (i  =  l)  is  noise,  then  the  first  point  is  a  starting 
point.  From  the  starting  point  of  i,  scan  and  test  the  next  signal  loca¬ 
tion  (left  edge  of  signal  plus  noise),  then  calculate  the  distance  as  dj. 
Scan  the  right  edge  point  of  the  signal  plus  noise  (above  threshold)  as 
the  starting  point  and  repeat  the  same  procedure  to  calculate  the  suc¬ 
cessive  signal  distances  as  d2d3,  ■  •  •  ,dn. 

Step  4:  Find  the  threshold  for  successive  signal  distances. 

Display  the  distribution  of  and  determine  the  threshold  D  by 
inspection  from  histogram  or  by  K-means  clustering  algorithm  for  K=2 
[ImBOa,  swa78a,  tou74a].  The  reason  for  K=2  is  that  the  signal  distances 
are  two  classes,  i.e.,  the  class  of  long  signal  distance  and  the  class  of 
short  signal  distance.  From  clustering  result,  the  two  class  centers  are 
determined.  Threshold  D  is  the  average  of  two  class  centers. 

Step  5.  Determine  the  boundary  location. 

After  determining  the  threshold  D  from  Step  4,  compare  d±  with  D. 
If  di>Z>,  then  a  boundary  is  set  in  the  middle  of  the  two  signals,  i.e.,  the 

boundary  is  determined  by  the  position  of  starting  point  plus  If 
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di<D,  then  no  boundary  is  found.  Accumulate  the  distance  without 
boundary.  Continue  to  compare  d*  with  D.  A  constant  is  set  if  the 
scanned  distance  is  larger  than  this  constant,  for  example  0.5  seconds, 
then  the  boundary  is  automatically  set  at  the  middle  of  the  next  inter¬ 
val.  This  can  avoid  no  boundary.  Boundary  distance  of  0.5  seconds  is 
good  from  [str70a].  If  bounday  distance  is  large,  for  example  1.0 
seconds,  the  Q  effect  is  large  [str70a]. 

According  to  the  analysis  above,  the  threshold  can  be  set  by  using 

the  method  (1)  or  (2)  in  Step  2.  An  algorithm  is  presented  as  in  the  fol¬ 
lowings. 

Algorithm 4. 1 :  Determination  of  partitioning  boundary  in  a  seismogram 

Input:  An  m  points  seismogram  with  N  traces,  constant  c  and  t, 
where  c  is  5  times  of  noisy  variance,  i  =  0.5  seconds 

Output.  Location  of  the  partitioning  boundary 

Method: 

(1)  Find  the  power  distribution  along  time  axis. 

(2)  Set  a  threshold  c  to  classify  signal  plus  noise  and  noise  only. 

(3)  Find  the  distance  between  two  successive  signals.  The  distances 
are  d1dZi  •  •  •  ,  and  g^. 

(4)  Find  the  threshold  D  of  the  signal  distances  by  using  K-means 
clustering  algorithm  for  K=2.  The  threshold  is  the  average  of  two  class 
centers. 

(5)  Determine  the  boundary  location. 

Compare  d*  with  D.  If  di>D,  then  a  boundary  is  set  at  the  middle 
point  of  the  two  signals.  If  ,  then  no  boundary  is  found.  Accumu¬ 
late  the  distance  without  boundary.  Continue  to  compare  d*  with  D. 


124 


(6)  If  the  scanned  distance  is  larger  than  the  constant  f  =  0.5 
seconds,  then  the  boundary  is  automatically  set  at  the  middle  of  the 
next  interval.  Go  to  (5),  until  the  end  of  the  data  is  reached. 


4.4  Results  of  Boundary  Determination 

The  results  of  applying  Algorithm  4.1  to  the  real  seismogram  at  Mis¬ 
sissippi  Canyon  Fig.  4.2  are  as  follows. 

Fig.  4.3  is  the  result  of  Step  (1)  to  find  the  power  distribution  along 
time  i,  i  =  1,2,  •  •  •  ,512.  The  variance  of  the  noise  in  the  lsf  trace  of 
Mississippi  Canyon  is  0.0001972  =  o2.  5<t2=0. 000986=0. 001.  In  Step  (2), 
the  threshold  for  classifying  signal  plus  noise  and  noise  is  0.001.  Fig. 
4.4  is  the  result  of  the  Step  (3)  and  (4).  Signal  distance  d*  is  displayed 
and  the  threshold  D  is  calculated  as  D  =  52  by  K-means  clustering 
analysis  for  K=2.  From  Step  (5),  the  boundaries  are  determined  at 
time=0.408  seconds,  0.776,  1.164,  and  1.724  seconds.  Five  sections  are 
generated.  The  resulting  boundaries  are  shown  in  Fig.  4.2. 

Fig.  4.5A,  4.5B,  4.5C,  4.5D,  4.5E  are  the  feature  distribution  of  the 
testing  traces  for  the  first,  second,  third,  fourth,  and  fifth  sections 
respectively.  Fig.  4.6  is  the  feature  distribution  of  the  testing  traces 
for  the  whole  trace.  Fig.  4.5D  is  compared  with  Fig.  4.6.  The  overlap¬ 
ping  distribution  of  the  envelope  and  instantaneous  frequency  has  been 
separated  in  Fig.  4.5D.  The  distribution  of  envelope  and  instantaneous 
frequency  in  the  4 th  section  (Fig.  4.5D)  is  not  influenced  by  those  from 

other  sections.  The  tree  classifier  design  in  Fig.  4.5D  is  easier  than  that 
in  Fig.  4.6. 
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Figure  4.2  Relative  amplitude  seismogram  at  Mississippi  Canyon 
and  partitioning  boundaries  (5  sections) 
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Figure  4.3  Power  distribution 
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Figure  4.4  Distribution  of  signal  distance  Jfc* 
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Figure  4.5A  Feature  distribution  of  training  traces  (The  first  section) 
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Figure  4.5B  Feature  distribution  of  training  traces  (The  second  sectioi) 
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Figure  4.5C  Feature  distribution  of  training  traces  (The  third  section) 
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Figure  4.5D  Feature  distribution  of  training  traces  (The  fourth  section) 
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Figure  4.5E  Feature  distribution  of  training  traces  (The  fifth  section) 
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Figure  4.6  Feature  distribution  of  training  traces 
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4.5  Results  of  Partitioning-Method  and  Tree  Classification 
for  the  Detection  of  Candidate  Bright  Spot 

The  5  sections  are  tested  using  3  hypotheses  and  their  correspond¬ 
ing  tree  classification  techniques.  Testing  traces  are  the 
4th,\Zth ,20th ,ZQth ,36th ,44th ,5Zth ,and60f/i  traces.  From  the  feature 
distribution  of  envelope  and  instantaneous  frequency  in  the  testing 
traces  of  each  section,  only  the  4th  section  returns  the  candidate 
bright  spot  information  and  satisfies  the  first  hypothesis  in  Chapter  3, 
i.e.,  high  amplitude,  low  frequency,  and  negative  polarity.  After  classi¬ 
fying  the  64  traces  in  the  4th  section,  the  complete  classification  result 
is  shown  in  Fig.  4.7.  The  classification  result  using  tree  classification 
without  partition  (Chapter  3)  is  shown  in  Fig.  4.8.  Fig.  4.7  is  compared 
with  Fig.  4.8.  In  Fig.  4.7,  using  the  partitioning  method,  the  candidate 
bright  spot  is  located  at  1.4  seconds.  Compared  with  Fig.  4.2,  Fig.  4.7 
has  a  significant  improvement  on  the  classification  result  that  is  quite 
important  in  seismic  interpretation. 


4.6  Discussions 

(1)  The  major  advantage  of  the  proposed  partitioning-method  is 
that  the  overlapping  distributions  of  the  envelope  and  instantaneous 
frequency  in  different  sections  can  be  separated,  i.e.,  the  distribution 
of  envelope  and  instantaneous  frequency  at  one  section  is  not  affected 
by  those  from  other  sections.  The  classifier  is  easy  to  design. 

(2)  The  threshold  in  Step  2  to  classify  the  noise  and  signal  plus 
noise  is  equal  to  5  times  of  the  variance  of  noise  in  the  seismogram. 
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Figure  4.7  Tree  classification  result  of  bright  spot 
(Partitioning-method  and  tree  classification) 


Figure  4.8  Tree  classification  result  of  bright  spot 
(Tree  classification  without  partition) 
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(3)  The  determination  of  partitioning  boundary  along  the  horizontal 
direction  can  also  be  used  along  the  vertical  direction  or  other  direc¬ 
tions.  In  a  long  seismogram,  partitioning  a  seismogram  along  the  verti¬ 
cal  direction  is  needed.  If  the  reflection  layer  is  slanted,  then  it  will  be 
necessary  to  partition  the  seismogram  in  that  direction.  The  direction 
of  the  boundary  determination  is  dependent  upon  the  direction  of  the 
reflection  layers. 

(4)  Computer  time  for  the  processing  of  small  sections  is  shorter 
than  that  for  the  whole  seismogram. 

(5)  The  proposed  method  for  boundary  determination  is  more 
objective  than  visual  inspection  [hua83c].  This  proposed  method  can  be 
applied  to  long  and  wide  seismograms  as  well. 
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CHAPTER  V 

USE  OF  SYNTACTIC  PATTERN  RECOGNITION  FOR  THE  DETECTION 
OF  CANDIDATE  BRIGHT  SPOT 


5. 1  Introduction 

In  Chapter  3,  decision-theoretic  pattern  recognition  techniques 
have  been  applied  to  the  classification  of  Ricker  wavelets  and  the  detec¬ 
tion  of  candidate  bright  spot  [hua82a,  83a].  The  classifier  design  is 
based  on  the  good  separability  among  different  classes.  When  the 
inter-class  separability  is  poor,  classifier  design  in  decision-theoretic 
pattern  recognition  is  not  easy.  In  Fig.  5.1A  (sampling  interval=0.004 
seconds),  there  are  three  kinds  of  zero-phase  Ricker  wavelet  (without 
noise),  two  20  Hz  Ricker  wavelets  with  reflection  coefficient  -0.25  and 
-0.12  respectively,  and  one  17  Hz  Ricker  wavelet  with  reflection 
coefficient  -0.12.  From  the  scatter  diagram  of  envelope  and  instantane¬ 
ous  frequency  in  Fig.  5.  IB,  it  is  not  easy  to  design  a  classifier  to  classify 
these  three  Ricker  wavelets.  So  the  approach  of  syntactic  pattern 
recognition  is  proposed.  The  structural  information  of  Ricker  wavelets 
is  used.  Also  in  a  seismogram,  the  wavelets  of  a  bright  spot  have  some 
specific  structure.  So  syntactic  pattern  recognition  approach  is  pro¬ 
posed  for  the  classification  of  Ricker  wavelets  and  the  detection  of  can¬ 
didate  bright  spot. 
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Figure  5.1  A  Signal,  its  instantaneous  frequency,  and  envelope 


Cnwlope 


Figure  5.  IB  Feature  distribution  of  a  seismic  trace 
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A  syntactic  pattern  recognition  technique  is  first  used  for  the 
classification  of  Ricker*wavelets  in  a  simulated  seismic  trace.  Then,  a 
syntactic  pattern  recognition  technique  is  used  to  detect  the  candidate 
bright  spot  in  a  simulated  seismogram  as  well  as  real  seismograms  of 
Mississippi  Canyon  and  High  Island. 

A  block  diagram  of  syntactic  pattern  recognition  system  for  the 
classification  of  Ricker  wavelets  in  a  simulated  seismic  trace  is  shown  in 
Fig.  5.2A.  In  the  classification  of  Ricker  wavelets,  amplitude- 
independent  and  amplitude-dependent  encodings  of  optimal  quantiza¬ 
tion  are  used.  Many  techniques  of  peak  recognition  and  waveform 
analysis  can  be  found  in  [hor75a,  77a,  pav71a,  73a,  san79a,  shl82a].  In 
this  study,  Ricker  wavelets  is  represented  by  the  optimal  quantization 
encoding.  The  global  detection  is  to  detect  the  possible  wavelets. 
Levenshtein  distance  is  computed  between  the  possible  wavelet  string 
and  the  reference  strings  of  Ricker  wavelets.  The  local  detection  is  to 
extract  the  candidate  wavelet.  Minimum  distance  and  nearest-neighbor 
classification  rules  are  used. 

A  block  diagram  of  syntactic  pattern  recognition  system  for  the 
detection  of  candidate  bright  spot  is  shown  in  Fig.  5.2B.  In  the  detec¬ 
tion  of  candidate  bright  spot,  testing  traces  are  selected  from  the  input 
seismogram  and  tree  classification  techniques  are  used  in  the  detection 
of  candidate  bright  spot.  From  the  detected  candidate  bright  spot,  the 
sample  patterns  of  the  wavelets  are  extracted.  Amplitude-dependent 
encoding  with  optimal  quantization  is  also  used.  Using  the  probability 
of  detection,  a  threshold  is  set  to  detect  the  candidate  bright  spot. 
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Figure  5.2A  A  block  diagram  of  syntactic  pattern  recognition  system 
(For  classification  of  Ricker  wavelets  in  a  seismic  trace) 


1.41 


Figure  5.2B  A  block  diagram  of  syntactic  pattern  recognition  system 
(For  the  detection  of  candidate  bright  spot  in  a  seismogram) 
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5.2  Syntactic  Pattern  Recognition  for  the  Classification  of 
Ricker  Wavelets  in  the  Simulated  Seismic  Trace 


5.2.1  Optimal  Quantization  Encoding 

In  Fig.  5.4,  there  are  four  classes  in  the  simulated  seismic  trace. 

Cl:  r(t)=n(t)+20  Hz  Ricker  wavelet  with  reflection  coefficient  -0.18 
C2.  r(t)-n(t)+20  Hz  Ricker  wavelet  with  reflection  coefficient  -0.2960 
C3.  r(t)-n(t)  + 17  Hz  Ricker  wavelet  with  reflection  coefficient  -0.18 
C4:  r(t)=n(t)  Gaussian  10-60  Hz  noise  with  A/’(0,0.0242) 

Initially,  the  seismic  trace  is  encoded  by  amplitude-independent 
encoding  and  amplitude-dependent  encoding  of  the  optimal  quantiza¬ 
tion  encoding.  Assign  the  ith  pair  of  waveform  points 
[(xi>!/z).(a;i+i>2/i+i)]  to  the  symbol  cj*  connoting  the  slope  characteristic 
of  the  line  segment  joining  the  two  points.  Setting  di=yi+1-yi  .  There 

are  two  kinds  of  string  encoding  by  using  the  optimal  quantization 
encoding. 

(1)  Amplitude-independent  encoding 

The  primitives  are  as  follows. 
ot=p  ,  if  <^>0.002 
Uj=o  ,  if  — 0.002 <c£j  <0.002 
cJi=rL  ,  if  £^<—0.002 

The  constant  0.002  is  a  slope  tolerance  to  provide  a  zero-like  slope 
attribute.  This  kind  of  encoding  pattern  can  not  be  used  for  the 
classification  of  20 Hz  Ricker  wavelets  with  different  amplitudes, 
because  the  encoded  strings  are  all  the  same.  Using  these  primitives, 

only  two  kinds  of  Ricker  wavelets  can  be  classified,  i.e.,  20  and  17  Hz 
Ricker  wavelets. 
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Figure  5.3  20  Hz  and  17  Hz  Ricker  wavelets  without  noise 
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Figure  5.4  20  Hz  and  17  Hz  Ricker  wavelets  with  Gaussian  noise 


Output  value 


Figure  5.5  Classification  result  of  two  classes  (for  the  amplitude- 
independent  encoding  and  the  minimum  distance  classification  rule) 
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Figure  5.6  Classification  result  of  two  classes  (for  the  amplitude- 
independent  encoding  and  the  NN  decision  rule) 
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Figure  5.7  Classification  result  of  three  classes  (for  the  amplitude- 
dependent  encoding  and  the  minimum  distance  classification  rule) 
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(2)  Amplitude-dependent  encoding 

For  the  amplitude-dependent  encoding,  the  assignment  of 
di-Vi+\-yi  to  a  symbol  is  a  quantization  problem.  From  the  analysis  of 
the  central  part  of  17 Hz  Ricker  wavelet,  the  amplitudes  of  the  13  data 
points,  yit  i  =  1,2,  •  •  •  ,13  are 

-.0796,  -.0737,  -.0399,  .0213,  .0952,  .1563,  .1800,  .1563, 

.0952,  .0213,  -.0399,  -.0737,  -.0796. 

The  13  data  points  are  indicated  in  Fig.  5.3.  The  distribution  of  the 
differences  di=yi+i—yit  i  =  l,2,  .  .  .  ,12,  is  a  Gaussian  distribution  which 
can  be  proved  by  a  xZ  ~  test  [net82a].  From  the  analysis  of  the  central 
part  of  20 Hz  Ricker  wavelet,  the  amplitudes  of  the  same  13  data  points, 
yit  i  =  1,2,  •  •  •  ,13  are 

-.1317,  -.1100,  -.0230,  .1137,  .2428,  .2960,  .2428,  .2960, 

.2428,  .1137,  -.0230,  -.1100,  -.1317. 

The  13  data  points  are  indicated  in  Fig.  5.3.  The  distribution  of  the 
differences  di=yi+1—yi,  i  =  l,2,  •  •  •  ,12,  is  a  Gaussian  distribution  which 
can  be  proved  by  a  x2  ■  test  [netS2a].  The  number  of  selected  data 
points  is  not  fixed,  since  it  depends  on  the  shape  of  the  represented 
wavelet. 

The  optimal  quantization  of  8  levels  for  the  Gaussian  samples  is 
used  [gag78a].  From  the  experiments  in  this  chapter,  if  the  standard 
deviation  <rs  of  the  di  of  the  signal  is  larger  than  1.5crn  of  the  d*  of  the 
noise,  then  8-level  optimal  quantization  is  good.  For  the  number  of 
quantization  levels  is  larger  than  8,  for  example  16,  the  interval  of 
quantization  is  smaller  and  the  representation  of  the  signal  without 
noise  is  more  accurate  than  the  representation  of  using  8-level  quanti¬ 
zation.  For  large  noise,  the  quantization  will  be  very  sensitive.  For  the 
number  of  quantization  levels  is  less  than  8,  for  example  4  or  2,  the 


... 
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interval  of  quantization  is  large,  different  signal  may  be  quantized  as 
the  same  representation.  It  is  enough  to  use  8-level  quantization  for 
high  S/N  in  the  later  analysis.  The  primitives  are  as  follows. 


Ui  =  d  for  1.76a 

c8=2.15ct 

Ui=c  for  1.05u<oii<  1.76a 

c7=  1.34a 

CJ i=b  for  0.5CT<di<l. 05cr 

c6=0.75(7 

cj^  =  a  for  0.0a<c£i<0.5a 

c  5=0.24cr 

cj i=A  for  -0.5a<di<0.0cr 

c  4= -0.24cr 

Ui=B  for  —1.05<7<dj<— 0.5a 

c3=— 0.75a 

cJi  =  C  for  —  1.76a<dj<  — 1.05ct 

c2=  —  1.34a 

cj i=D  for  dt<— 1.76a 

c  !  =— 2. 15a 

where  a  is  the  standard  deviation  of  d^=yi+i—yi  distribution  for  the 
wavelet  and  ci,i  =  l,2,  •  •  •  ,8  are  the  conditional  sample  mean  value  of 
each  interval.  Using  these  primitives,  three  kinds  of  Ricker  wavelet  can 
be  classified.  a=0. 051661  of  d ^  distribution  is  selected  from  the  smal¬ 
lest  a  for  17  Hz  Ricker  wavelet  and  the  8-level  optimal  quantization  is 
determined. 

The  amplitude-independent  and  amplitude-dependent  encoded 
strings  for  the  central  part  of  Ricker  wavelets  without  noise  are  listed 
in  Table  5.1.  These  strings  are  used  as  the  reference  strings  of  the 
three  kinds  of  Ricker  wavelet  and  represented  by  xi  ,i  =  1,2,3  .  x1 
represents  the  string  of  20 Hz  Ricker  wavelet  with  reflection  coefficient 
-0.18.  x2  represents  the  string  of  20 Hz  Ricker  wavelet  with  reflection 
coefficient  -0.2960.  x3  represents  the  string  of  17 Hz  Ricker  wavelet 
with  reflection  coefficient  -0.18.  Their  lengths  are  equal  in  the  compu¬ 
tation  of  Levenshtein  distance. 
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Table  5. 1  String  of  Ricker  wavelets 


Amplitude-Independent 

Amplitude-Dependent 

20  Hz  Ricker  wavelet 
with  reflection 
coefficient  -  0. 18 

np5n5p 

Aab  c  c  bBCCBAa 

20  Hz  Ricker  wavelet 
with  reflection 
coefficient  -  0.2960 

np5n5p 

Aac  ddbBDD  CAa 

17  Hz  Ricker  wavelet 
with  reflection 
coefficient  -  0.18 

p6n6 

abcccaA  CCCBA 
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5.2.2  Likelihood  Ratio  Test  (LRT) 

Suppose  that  an  independent,  identical  Gaussian  noise 
ni>i  =  1.2,  ’  •  •  ,N  with  zero  mean  and  variance  a2  is  added  to  the  17  Hz 
Ricker  wavelet,  yi.i  =  l,2,  •  •  •  ,N,  then,  cii=2/i  +  1+ni+1-(?/i+ni) 
~yi+i~~yi  +71i  +  i"~ni  .  So  di  is  a  Gaussian  noise  with  mean  l/t  +  i  — i/*  ,  and 
variance  2o2.  If  only  a  Gaussian  noise  is  present,  then  d^n^-n*  is 
also  Gaussian  with  N(0,2o2). 

From  Section  5.2.1,  the  distribution  of  the  differences  di=yi+1-yi, 

x  =  l,2,  •  •  •  ,N ,  of  17  Hz  Ricker  wavelet  is  a  Gaussian  distribution  with 
N{0,a2). 

Suppose  that  the  signal  and  the  noise  are  independent  Gaussian.  In  the 
detection  problem,  there  are  two  hypotheses.  One  is  the  signal  plus 
noise,  Af(0,os+o2)  ,  the  other  is  the  noise  only,  Af(0,o2)  .  Then  the  likel¬ 
ihood  ratio  test  [van68a]  is  shown  in  the  following. 

Hq'.  Gaussian  noise  P{r  )= —  ^ - exp  [ — — — -1 

2a2 

H i:  Signal  +  noise  P{r)=—~ — exp[:=~- ]  ,  where  of=o2  +  os2 

V271  Oj  2of  ins 

The  Likelihood  Ratio  Test  is  that 


A(r)  = 


pjrVHj) 

P(r/H  0) 


-r  2 


exP[-7T-( 


of 


1 


o; 


-)] 


If  A(r )>7j  ,  then  Hx  is  true. 

If  A(r  )<T]  ,  then  H0  is  true. 

Taking  In  on  both  sides  and  rearranging  the  terms,  we  obtain  the  follow¬ 
ing  result. 
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ln77  — ln(  — 

Let  - — — =(32 

O'  P  P  / 


2  a. 


<7? 


If  rz>(Sz  ,  then  //j  is  true. 
If  rz<ftz  ,  then  H0  is  true. 


i.e., 

t 

If  |r  |  >/3  ,  then  //j  is  true. 

If  |r  |  </?  ,  then  H0  is  true. 

Suppose  that  H0  and  H x  are  equal  probable,  then  77=  1  is  used  in  the 
experiments  of  this  chapter. 


For  the  17  Hz  Ricker  wavelet,  crs  =  0.051661  ,  crn  =  0.033941  and 
aj=0. 061813.  Substituting  crn  and  Oj  into  LRT  formula  above,  then  a  sig¬ 
nal  is  present,  if  r>0. 0444542,  or  r<-0. 0444542  . 

In  the  following,  global  and  local  detections  are  for  the  amplitude- 
dependent  encoding. 


5.2.3  Global  Detection 

For  an  input  seismic  string,  the  wavelets  must  be  detected  and 
extracted.  So  a  global  detection  is  proposed  to  detect  the  possible 
wavelets.  Combining  the  likelihood  ratio  test  (LRT)  and  the  8-level 
optimal  quantization  of  17  Hz  Ricker  wavelet,  a  signal  is  detected  if 
r>0. 054244=  1.05as ,  or  r<>— 0.054244=  —  1.05crs  ,  i.e.  the  region  of  the 
symbols  ’c\  ’d\  ’C\  and  ’D*  in  Fig.  5.8A.  The  probability  of  detecting 
signal  is  0.2938.  For  every  input  string  of  12  symbols, 
12*0.2938=3.5256.  Truncated  3.5256=3.  The  approximated  number  of 
detected  symbol  is  3.  So  a  threshold  is  set,  if  the  number  of  symbols, 
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Figure  5.8A  8-level  optimal  quantization  encoding  for  d*  of 
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Figure  5.8B  Probability  of  detection  for  ^  of  17  Hz  Ricker  wavelet  with 
amplitude-dependent  encoding 
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’c\  ’d\  ’C’,  or  ’D’,  is  equal  to  or  larger  than  3,  then  a  possible  wavelet  is 
detected.  Otherwise,  input  the  next  string.  The  purpose  of  this  global 
detection  is  to  extract  the  possible  wavelet  and  to  reduce  the  string 
distance  computation  time. 


5.2.4  Levenshtein  Distance 

Levenshtein  distance  between  two  strings  [lev66a]  is  defined  as  the 
minimum  number  of  symbol  insertions,  deletions,  and  substitutions 
required  to  transform  one  string  into  the  other  string.  The  Levenshtein 
distance  between  z  and  y  denoted  as  dL(x,y )  is 
dL  (x  ,y)=Min[kj  +mj +TLj  ] 

where  kj  ,m j ,  and  nj  are  the  number  of  substitution,  deletion,  and 
insertion  transformations  respectively.  This  distance  can  be  computed 
using  a  string-to-string  correction  algorithm  [wag74a]. 


5.2.5  Local  Detection 

The  local  detection  for  amplitude-dependent  and  amplitude- 
independent  encoding  is  discussed  in  the  following. 

(I)  Amplitude-dependent  encoding 

Levenshtein  distance  between  the  possible  wavelet  string  and  the 
reference  strings  of  Ricker  wavelets  is  computed,  a  threshold  is  set  to 
detect  whether  or  not  the  possible  wavelet  is  a  candidate  wavelet.  The 
threshold  is  calculated  from  the  probability  of  detecting  every  encoded 
symbol.  From  the  analysis  of  17 Hz  Ricker  wavelet  (the  smallest 
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variance),  c li-yi+1  yi  falls  into  one  quantization  interval  of  8  intervals 
and  corrupted  by  Gaussian  noise  N( 0,2a*).  So  the  probability  of  detec¬ 
tion  in  this  quantization  interval  can  be  calculated  from  Figure  5.8.  For 
12  di  of  17  Hz  Ricker  wavelet,  the  probability  of  detecting  every  d*  can 
be  calculated  by  using  the  statistical  table  of  normal  distribution.  The 
sum  of  these  12  detection  probabilities  is  4.1556. 

(2*(.2875+.28l9+.3167+.3906  +  .3906+.4105)=4. 1556) 

Truncated  4.1556  =  4. 

The  approximated  number  of  detection  symbols  is  4.  The  number  of 
missing  and  false  symbols  is  12-4=8,  i.e.,  error  or  un-detected  symbols. 
From  the  definition  of  Levenshtein  distance,  if  Levenshtein  distance 
between  an  input  string  of  17  Hz  Ricker  wavelet  corrupted  by  Gaussian 
noise  and  a  reference  string  of  17  Hz  Ricker  wavelet  without  noise  is 
computed,  then  the  input  string  belongs  to  the  reference  string  if 
Levenshtein  distance  is  less  than  8  symbols.  So  8  is  selected  as  a  thres¬ 
hold.  The  threshold  is  set  7  in  the  experiment.  The  same  calculations 
are  for  the  20 Hz  Ricker  wavelets.  For  an  input  string  y ,  if  the  distance 
^  (z  >2/)^7,  or  dL(xz,y)<7 ,  or  dL(x3,y)<,7,  then  the  minimum  distance 
and  the  nearest-neighbor  classification  rules  are  used.  Otherwise,  input 
the  next  interval  string  and  repeat  the  process.  The  purpose  in  setting 
this  threshold  is  to  extract  true  wavelets  for  the  classification.  The 
number  of  the  detected  symbols  is  dependent  on  the  location  of 
^i=yi  +  \~Vi  and  the  variance  of  Gaussian  noise. 

(II)  Amplitude-independent  encoding 

For  the  amplitude-independent  encoding  experiment,  Levenshtein 
distance  between  every  input  string  and  the  reference  strings  of  Ricker 
wavelets  is  computed.  From  Section  5.2.1,  three  intervals  are  for 
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symbols  ’n’,  ’o’,  and  ’p’.  The  difference  of  17  Hz  Ricker  wavelet  falls 
into  p  or  n’.  Then,  this  is  corrupted  by  the  Gaussian  noise 
7V(0,2*o|),  di=(yi+1-yi)+^Li+1-ni),  where  nit  i- 1,2,  ■  •  •  ,N  are  Gaus¬ 
sian  independent  with  JV(0,a®).  Then  the  probabihty  of  detecting  this 
di  can  be  calculated.  For  the  12  d*  of  the  17  Hz  Ricker  wavelet,  the 
probability  of  detecting  every  d*  can  be  calculated  by  suing  the  statisti¬ 
cal  table  of  normal  distribution.  (Fig  5.9).  The  sum  of  these  12  detec¬ 
tion  probabilities  is  10.0692. 

(2*(. 545+. 789+. 8264+.  9706  +  .9706  +  . 9830)  =  10. 0692) 

Truncated  10.0692  =  10. 

The  approximated  number  of  detection  symbols  is  10.  The  number  of 
missing  and  false  symbols  is  12-10=2,  i.e.,  error  or  un-detected  sym¬ 
bols.  The  input  string  belongs  to  the  reference  string  if  Levenshtein 
distance  is  less  than  2  symbols.  So  2  is  selected  as  a  threshold.  The 
same  calculations  are  for  the  20 Hz  Ricker  wavelets.  The  threshold  is 
set  to  2  to  extract  the  Ricker  wavelets  in  this  experiment.  The  central 
parts  of  the  20 Hz  Ricker  wavelets  with  reflection  coefficient  -0.18  and 
-0.2960  respectively  are  encoded  as  the  same  string.  x1=x2  .  If  the  dis¬ 
tance  dL{x\y)< 2,  or  dL(x3,y)< 2,  then  the  candidate  wavelet  is 
extracted  and  the  minimum  distance  classification  rule  is  used.  Other¬ 
wise,  input  the  next  12-symbol  string. 


5.2.6  Classification  Rules  and  Results 

The  classification  methods  presented  here  are  the  minimum- 
distance  and  the  nearest-neighbor(NN)  classification  rules.  The  advan¬ 
tage  of  these  rules  is  the  fast  computation. 
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Figure  5.9B  Probability  of  detection  for  d*  of  17 Hz  Ricker 
wavelet  with  amplitude-independent  encoding 
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The  definition  of  the  minimum  distance  and  the  NN  classification 
rules  are  as  follows. 

a)  Minimum-distance  classification  rule 

There  are  3  reference  strings  of  Ricker  wavelets,  x1,®2,  and  x3.  y 
belongs  to  ojt  if  dL(x3'  ,y)=Min[dL(xi  ,y)] 

i 

b)  NN  classification  rule 

There  are  two  classes  of  Ricker  wavelet,  i.e.,  20  and  17  Hz  only,  Uj 
and  cj2.  X 1  and  Xz  are  the  reference  string  sets  for  20  and  17  Hz  Ricker 
wavelets, 

where  Xx  =  [x\ ,x£],  Xz=[xf].  dL (X1  ,y)=Min[dL(xfc,y)]. 

k 

y  belongs  to  ajt  if  dL (xj ,y)=Min[dL (X1  ,y)] 

i 

The  two-  and  three-  class  classification  problems  are  stated  in  the 
following. 

(I)  Two-class  classification:  If  the  amplitude-independent  encoding 
is  used  in  the  primitive  recognition,  then  the  central  parts  of  the  20 Hz 
Ricker  wavelets  with  reflection  coefficient  -0.18  and  -0.2960  respectively 
are  encoded  as  the  same  string.  xx-xz  .  So  there  are  two  classes  in 
the  seismic  trace,  i.e.,  20  and  17  Hz  Ricker  wavelets  and  the 
minimum-distance  classification  rule  is  used.  The  classification  result 
is  shown  in  Fig.  5.5.  Two  classes  are  in  Fig.  5.5,  one  is  the  20Hz  Ricker 
wavelet,  the  other  is  the  YlHz  Ricker  wavelet.  If  the  amplitude- 
dependent  encoding  is  used  in  the  primitive  recognition,  x1  and  x2  are 
considered  as  one  class  set.  There  are  two  classes  and  the  NN- 
classification  rule  is  applied.  The  classification  result  is  shown  in  Fig. 
5.6.  Two  classes  are  in  Fig.  5.6,  one  is  the  20 Hz  Ricker  wavelet,  the 
other  is  the  17/7z  Ricker  wavelet.  Compared  with  the  original  seismic 
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trace  in  Fig.  5.4,  the  result  of  using  the  amplitude-dependent  encoding 
is  better  than  that  of  using  the  amplitude-independent  encoding. 
Because  in  Fig.  5.5,  the  result  at  the  20 Hz  Ricker  wavelet  with 
reflection  coefficient  -0.18  has  two  class  results. 

(II)  Three-class  classification:  The  amplitude-dependent  optimal 
quantization  encoding  is  used  and  the  minimum-distance  classification 
rule  is  applied.  Compared  with  the  original  seismic  trace  in  Fig.  5.4,  the 
classification  result  shown  in  Fig.  5.7  is  quite  good.  The  classification 
result  in  Fig.  5.7  is  the  17 Hz  Ricker  wavelet  and  the  20 Hz  Ricker 
wavelets  with  amplitude  -0.18  and  -0.2960  respectively. 


5.3  Syntactic  Pattern  Recognition  for  the  Detection 
of  Candidate  Bright  Spot  in  a  Simulated  Seismogram 

In  Fig.  5. 10A,  there  are  three  classes  in  the  simulated  seismogram. 
Cl:  r(t)=n(t)+20  Hz  Ricker  wavelet  with  reflection  coefficient  -0.29598 
at  top  of  gas  sand  zone  with  polarity  reversal. 

C2.  r(t)-n(t)+30  Hz  Ricker  wavelet  from  other  boundaries. 

C3:  r(t)=n(t)  Gaussian  10-60  Hz  white  noise 

This  is  the  second  hypothesis  of  the  three  hypotheses  described  in 
Chapter  3.  The  candidate  bright  spot  is  the  one  with  high  amplitude 
and  low  frequency  content.  In  processing  this  simulated  seismogram, 
the  data  are  not  known  as  candidate  bright  spot  or  non-candidate 
bright  spot.  Testing  traces  are  randomly  selected  from  the  input 
seismogram.  Tree  classification  techniques  [hua83a]  are  used  in  the 
detection  of  candidate  bright  spot.  Envelope,  instantaneous  frequency, 
and  polarity  are  used  as  the  features.  Three  hypotheses  (Chapter  3) 
are  the  constrained  conditions  for  the  detection  of  candidate  bright 
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Figure  5. 10A  Synthetic  seismogram  of  bright  spots 
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Figure  5. 10B  Testing  traces 
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Figure  5.  IOC  Feature  distribution  of  testing  traces 
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Figure  5.10D  Tree  classification  result  of  bright  spots 
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Figure  5.10E  Classification  result  of  bright  spot  (Threshold  =  1) 
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Figure  5.10F  Classification  result  of  bright  spot  (Threshold  =  2  ) 
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spot.  From  the  detected  candidate  bright  spot  on  the  28 th  and  36th 
traces,  two  sample  patterns  of  the  Ricker  wavelet  with  noise  are 
extracted.  The  distribution  of  on  the  extracted  samples  (signal  plus 
noise)  is  a  Gaussian  N(0,af)  ,  (^=0. 100853.  The  distribution  of  d*  for 
noise  only  on  the  32 th  trace  is  a  Gaussian  7V(0,ct2)  ,  an  =  0.012.  Suppose 
that  both  signal  and  noise  are  independent  Gaussian,  then  crf=o^+os2. 
as  =  0.0998646  .  The  8-level  optimal  quantization  encoding  is  determined 
by  using  crs.  So  two  extracted  pattern  samples  can  be  encoded  as 
strings  of  candidate  bright  spot  patterns  and  have  10  symbols  respec¬ 
tively  here  (Table  5.2).  The  input  seismogram  is  encoded  also. 

(1)  Likelihood  Ratio  Test  (LRT) 

From  the  likelihood  ratio  test,  the  threshold  for  signal  and  noise  is 
determined.  A  signal  of  20 Hz  Ricker  wavelet  is  present  if  r>0. 0249224 
or  r<-0. 0249224  . 

(2)  Threshold  for  global  detection 

Comparing  the  8-level  optimal  quantization  with  likelihood  ratio 
test,  the  closest  levels  to  /3= ±0.0208826  are  the  end  points  at 
±0.0249277=±0.5cf5  .  (Fig.  5.11).  Then,  ±0.0249277=±0.5crs  are  selected 
as  the  new  threshold.  The  areas  above  0.0249277  and  below  -0.0249277 
are  the  detected  areas  of  signal,  i.e.,  the  intervals  of  ’b’,  ’c’,  ’d’,  ’B’,  'C’, 
and  ’D’.  The  probability  of  detecting  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  and  ’D’  is  0.617. 
10*0.617=6.17.  Truncated  6.17=6.  The  approximated  number  of 
detected  symbols  at  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  and  ’D’,  is  6  symbols.  So  a 
threshold  is  set,  if  the  number  of  symbol,  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  or  ’D’,  is 
equal  to  or  larger  than  6,  then  a  possible  wavelet  is  detected.  Other¬ 
wise,  input  the  next  10  symbol  string. 
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Table  5.2  The  Encoded  String  of  the  Extracted  Patterns 


String 

Representation 

[l]  For  Simulated  Seismogram 

(1)  Pattern  from  the  28th  trace 

(2)  Pattern  from  the  36th  trace 

abccbBCCBA 

abccbACCBA 

[2]  For  Mississippi  Canyon 

(1)  Pattern  from  the  13th  trace 

(2)  Pattern  from  the  21st  trace 

(3)  Pattern  from  the  29th  trace 

(4)  Pattern  from  the  37th  trace 

BDCbcddB 

ABBAAbdc 

BDCaddbc 

ACCAcddc 

[3]  For  High  Island 

(1)  Pattern  from  the  45th  trace 

(2)  Pattern  from  the  50th  trace 

(3)  Pattern  from  the  51st  trace 

(4)  Pattern  from  the  53rd  trace 

BBAbccaBBA 

BBBacdbBCB 

BBAbdcaCCB 

BBAbdcaBBA 
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Figure  5.11A  8-level  optimal  quantization  encoding  for  d*  of 
bright  spot  pattern  in  simulated  seismogram 


Figure  5.UB  Probability  of  detection  using  the  conditional  mean 
of  each  interval  in  bright  spot  pattern 
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(3)  Threshold  for  the  detection  of  candidate  bright  spot 

For  the  detection  of  candidate  bright  spot  in  simulated  seismogram 
and  real  data  experiment,  there  are  no  non-candidate  bright  spot 
strings.  After  the  Levenshtein  distance  between  a  possible  wavelet 
string  and  the  extracted  string  of  candidate  bright  spot  patterns  is 
computed,  a  threshold  k  is  set  to  detect  whether  or  not  the  possible 
wavelet  is  a  candidate  bright  spot  wavelet.  The  process  is  similar  to  the 
local  detection  in  the  classification  of  Ricker  wavelets  described  above. 
Suppose  that  x  is  a  candidate  bright  spot  string,  if  dL  (x  ,y)<k ,  then  y  is 
the  detected  candidate  bright  spot  string. 

The  exact  amplitude  of  the  signal  is  not  known  because  the  signal  is 
corrupted  by  noise.  The  probability  of  detection  can  not  be  calculated 
as  that  for  the  17  Hz  Ricker  wavelet  described  in  Section  5.2.5.  An 
alternative  method  is  proposed.  Using  the  8-level  optimal  quantization, 
the  dt  value  is  quantized  to  the  quantized  value  ,  the  conditional 
mean  of  each  quantization  interval.  Assume  that  Gaussian  noise  is 
added  to  the  quantized  value  ci  ,  then  the  probability  of  detection  in 
the  quantized  interval  can  be  calculated.  For  20  ci  of  the  quantized 
value  of  the  candidate  bright  spot,  the  probability  of  detecting  every 
can  be  calculated  by  using  the  statistical  table  of  normal  distribution. 
The  sum  of  these  20  detection  probabilities  is  19.5684. 

(There  are  5  ’a’  or  ’A’,  7  ’b’  or  ’B\  and  8  ’c'  or  ’C'. 

5  *.96 18+ 7*.  975+ 8  *.99 18= 19. 5684) 

For  string  of  10  symbols,  19.5684/2=9.7842.  Truncated  9.7842  =  9. 

The  approximated  number  of  detected  symbols  is  9.  The  number  of 
missing  and  false  symbols  is  10-9=1,  i.e.,  error  or  un-detected  symbols. 
The  input  string  represents  a  candidate  bright  spot  if  the  Levenshtein 
distance  is  less  than  1  symbol.  So  1  is  selected  as  a  threshold.  The 
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threshold  is  set  1  or  2  to  detect  the  candidate  bright  spot  in  the  experi¬ 
ment.  A  typical  classification  result  of  using  threshold  1  is  shown  in  Fig. 

5.10E  and  a  typical  classification  result  of  using  threshold  2  is  shown  in 
Fig.  5.1  OF. 


5.4  Syntactic  Pattern  Recognition  for  the  Detection  of 
Candidate  Bright  Spot  in  the  Real  Data  Experiment 

Two  relative  amplitude  real-data  seismograms  are  processed. 

(I)  Experiment  at  Mississippi  Canyon 

In  Fig.  5.12A,  at  0.2  and  1.4  seconds,  the  dominant  wavelets  are  the 
minimum-phase  wavelets.  Testing  traces  are  randomly  selected  and 
tree  classification  techniques  are  used  [hua83a].  The  selected  samples 
of  detected  candidate  bright  spot  are  on  the  13£h,  21sf ,  29 th,  and  37 th 
traces  (Fig.  5.12B).  The  smallest  variance  of  d ^  distribution  on  the  21f/i 
trace  is  used  in  the  analysis  of  signal  and  noise.  The  distribution  of  d* 
at  the  extracted  samples  (signal  plus  noise)  of  the  21sf  trace  is  Gaus¬ 
sian  with  j¥(0,cjf)  and  aj=0. 055478.  The  distribution  of  d ^  for  noise  only 
on  the  lsf  trace  is  Gaussian  with  A/’(0,cr^)  and  an  =  0.014043.  Suppose 
that  both  signal  and  noise  are  independent  Gaussian,  then  af=cr^  +  a ^ 
and  as -0.053673  .  The  8-level  optimal  quantization  encoding  is  deter¬ 
mined  by  using  crs  .  So  four  extracted  pattern  samples  can  be  encoded 
as  strings  of  candidate  bright  spot  patterns  and  each  string  has  8  sym¬ 
bols  (Table  5.2).  The  pattern  string  of  the  candidate  bright  spot  in  the 
21th,  trace  is  ABBAAbdc .  The  input  seismogram  is  encoded  also. 


(1)  Likelihood  Ratio  Test  (LRT) 
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Figure  5.12A  Real  seismogram  at  Mississippi  Canyon  (Negative  on  the 
right) 
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Figure  5. 12B  Tree  classification  result  of  bright  spots 
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Figure  5.12C  Classification  result  of  bright  spots 
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From  the  likelihood  ratio  test,  the  threshold  for  signal  and  noise  is 
determined.  Signal  is  present,  if  r>0. 0240482,  or  r<— 0. 0240482  . 

(2)  Threshold  for  global  detection 

Comparing  the  8-level  optimal  quantization  with  LRT,  the  closest 
levels  to  jS=±0. 0240482  are  the  end  points  at  ±0.0268365=±0.5(7S .  (Fig. 
5.13).  Then,  ±0.0268365=±0.5crs  are  selected  as  the  new  threshold.  The 
areas  above  0.0268365  and  below  -0.0268365  are  the  detected  areas  of 
the  signal,  i.e.,  the  intervals  of  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  and  ’D’.  The  probabil¬ 
ity  of  detecting  ’b\  ’c',  ’d\  ’B\  ’C\  and  ’D’  is  0.617.  8*0.617=4.936. 
Trumcated  4.936  =  4.  The  approximated  number  of  detected  symbols 
at  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  and  ’D’,  is  4  symbols.  Here  a  threshold  is  set,  if 
the  number  of  symbol,  ’b’,  ’c’,  ’d’,  ’B’,  ’C’,  or  ’D’,  is  equal  to  or  larger 
than  5,  then  the  possible  wavelet  is  detected.  Otherwise,  input  the  next 
8  symbol  string. 

(3)  Threshold  for  the  detection  of  candidate  bright  spot 

The  exact  amplitude  of  the  signal  is  not  known  because  the  signal  is 
corrupted  by  noise.  The  probability  of  detection  can  not  be  calculated 
as  that  for  the  17  Hz  Ricker  wavelet  described  in  Section  5.2.5.  An 
alternative  method  is  proposed.  Using  the  8-level  optimal  quantization, 
the  value  is  quantized  to  the  quantized  value  ,  the  conditional 
mean  of  each  quantization  interval.  Assume  that  Gaussian  noise  is 
added  to  the  quantized  value  ci  ,  then  the  probability  of  detection  in 
the  quantized  interval  can  be  calculated.  For  8  of  the  quantized 
value  of  the  candidate  bright  spot,  the  probability  of  detecting  every  ci 
can  be  calculated  by  using  the  statistical  table  of  normal  distribution. 
The  sum  of  these  8  detection  probabilities  is  5.7354. 
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Figure  5.13A  8-level  optimal  quantization  encoding  for  d*  of 


bright  spot  pattern  in  Mississippi  Canyon 


Figure  5.13B  Probability  of  detection  using  the  conditional  mean 
of  each  interval  in  bright  spot  pattern 
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There  are  3  ’A’,  3  ’b’  or  ’B’,  1  ’c\  and  1  ’d’. 

3*0.66+3*.704+ 1*.71 15+1  *.93 19=5. 7354.  Truncated  5.7354  =  5. 

The  approximated  number  of  detected  symbols  is  5.  The  number  of 
missing  and  false  symbols  is  8-5=3,  i.e.,  error  or  un-detected  symbols. 
The  input  string  belongs  to  the  string  of  candidate  bright  spot  if 
Levenshtein  distance  is  less  than  3  symbols.  So  3  is  selected  as  a  thres¬ 
hold.  Suppose  that  x\  x2,  x3,  and  x4  are  candidate  bright  spot  strings, 
if  dL{x\y)<klt  or  dL{x2,y)<kz,  or  dL{x3,y)^kz,  or  dL(x4,y)<k4,  then  y 
is  the  detected  candidate  bright  spot  string.  The  threshold  values  of 
&2>  ^3>  and  k4  are  3  in  this  experiment.  The  typical  classification 
result  is  shown  in  Fig.  5.12C.  Compared  with  the  original  seismogram  in 
Fig.  5.12A,  Fig.  5.12C  is  good. 

(II)  Experiment  at  High  Island 

In  Fig.  5.14A,  at  1.7  seconds,  the  dominant  wavelets  are  the  zero- 
phase  wavelets.  Testing  traces  are  randomly  selected  and  tree 
classification  techniques  are  used.  The  detected  candidate  bright  spot 
of  testing  traces,  the  45£A  and  53rd,  are  at  both  ends  of  the  major 
reflection  layer,  bo  the  50 th  and  51sf  traces  in  the  middle  of  the  major 
reflection  layer  are  also  selected  as  the  testing  traces  (Fig.  5.14B).  The 
standard  deviations  of  d ^  from  4  extracted  patterns  are  very  close.  The 
distribution  of  d^  in  the  extracted  samples  (signal  plus  noise)  is  Gaus¬ 
sian  with  N(0,a?)  and  u^O.05125.  The  distribution  of  d*  for  noise  only 
on  the  1st  trace  is  Gaussian  with  Af(0,a^)  and  an  =  0.01 1862.  Suppose 
that  both  signal  and  noise  are  independent  Gaussian,  then  a^=a2+a2 
and  as  =0.0498554  .  The  8-level  optimal  quantization  encoding  is  deter¬ 
mined  by  using  <rs.  So  four  extracted  pattern  samples  can  be  encoded 
as  strings  of  candidate  bright  spot  patterns  and  each  string  has  10 
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Figure  5.14A  Real  seismogram  at  High  Island  (Negative  on  the  right) 
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Figure  5. 14B  Tree  classification  result  of  bright  spots 
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Figure  5.14C  Classification  result  of  bright  spots 
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symbols  (Table  5.2).  The  input  seismogram  is  encoded  also. 


(1)  Likelihood  Ratio  Test  (LRT) 

From  the  likelihood  ratio  test,  the  threshold  for  signal  and  noise  is 
determined.  Signal  is  present,  if  r>0. 0208826,  or  r<-0. 0208826  . 


(2)  Threshold  for  global  detection 

Comparing  the  8-level  optimal  quantization  with  LRT,  the  closest 
levels  to  /3=  ±0.0208826  are  the  end  points  at  ±0.0249277=  ±0.5crs .  (Fig. 
5.15).  Then,  ±0.0249277=±0.5<rs  are  selected  as  the  new  threshold.  The 
areas  above  0.0249277  and  below  -0.0249277  are  the  detected  areas  of 
the  signal,  i.e.,  the  intervals  of  ’b’,  ’c\  'd\  ’B\  ’C\  and  ’D’.  The  probabil¬ 
ity  of  detecting  *b’,  'c\  *d',  ’B\  ’C\  and  *D’  is  0.617.  10*0.617=6.17. 
Truncated  6.17=6.  The  approximated  number  of  detected  symbols  at 
b  ,  c  ,  d’,  ’B’,  ’C’,  and  ’D’,  is  6  symbols.  So  a  threshold  is  set,  if  the 
number  of  symbol,  'b',  ’c’,  ’d’,  ’B’,  ’C’,  or  ’D’,  is  equal  to  or  larger  than  6, 
then  the  possible  wavelet  is  detected.  Otherwise,  input  the  next  10- 
symbol  string. 


(3)  Threshold  for  the  detection  of  candidate  bright  spot 

The  exact  amplitude  of  the  signal  is  not  known  because  the  signal  is 
corrupted  by  noise.  The  probability  of  detection  cannot  be  calculated 
as  in  the  case  of  the  17  Hz  Ricker  wavelet  described  in  Section  5.2.5. 
An  alternative  method  is  proposed.  Using  the  8-level  optimal  quantiza¬ 
tion,  the  di  value  is  quantized  to  the  quantized  value  ,  the  conditional 
mean  of  each  quantization  interval.  Assume  that  Gaussian  noise  is 
added  to  the  quantized  value  c*  ,  then  the  probability  of  detection  in 
the  quantized  interval  can  be  calculated.  There  are  4  extracted 
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Figure  5.15A  8-level  optimal  quantization  encoding  for  d*  of 


Figure  5.15B  Probability  of  detection  using  the  conditional  mean 
of  each  interval  in  bright  spot  pattern 
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patterns  and  each  pattern  has  10  C<.  For  40  c4  of  the  quantized  value  of 
the  candidate  bright  spot,  the  probability  of  detecting  every  c<  can  be 
calculated  using  the  statistical  table  of  normal  distribution.  The  sum  of 
these  40  detection  probabilities  is  30.9724. 

(There  are  9  ’a’  or  ’A’,  20  ’b’  or  ’B\  0  ’c’  or  ’C\  and  3 ’d’  or  ’D’. 

9*. 705 1+20*. 7493+0  *.849+3*. 9495=30. 9724) 

For  pattern  string  of  10  symbols  of  the  candidate  bright  spot, 
30.9724/4=7.7431.  Truncated  7.7431=7. 

The  approximated  number  of  detected  symbols  is  7.  The  number  of 
missing  and  false  symbols  is  10-7=3,  i.e.,  error  or  un-detected  symbols. 
The  input  string  belongs  to  the  string  of  candidate  bright  spot  if  the 
Levenshtein  distance  is  less  than  3  symbols.  So  3  is  selected  as  a  thres¬ 
hold.  Suppose  that  x1,  xz}  x3,  and  x4  are  candidate  bright  spot  strings, 
if  dL(x1,y)<Ic1,  or  dL(x2,y)^h2,  or  dL  {x3,y)<Lk3,  or  dL(x4,y^k4,  then  y 
is  the  detected  candidate  bright  spot  string.  The  threshold  values  of 
*i,  k2,  k3,  and  k4  are  3  in  this  experiment.  The  classification  result  is 
shown  in  Fig.  5. 14C.  Compared  with  the  original  seismogram  in  Fig. 
5.14A,  Fig.  5.14C  is  quite  good. 


5.5  Conclusions 

(1)  Syntactic  pattern  recognition  uses  the  structural  information  of 
patterns.  In  a  simulated  seismic  trace,  syntactic  pattern  recognition 
technique  can  classify  the  Ricker  wavelets  which  decision-theoretic 
approach  is  not  easy  to  do. 

(2)  In  the  primitive  recognition,  there  are  two  methods  based  on 
the  optimal  quantization  encoding.  One  is  the  amplitude-independent 
encoding  which  can  not  classify  the  different  amplitude  of  f  Hz  Ricker 
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wavelets.  The  other  is  the  amplitude-dependent  encoding  which  can 
classify  every  kind  of  Ricker  wavelet. 

(3)  In  the  primitive  recognition,  assigning  the  pair  yi+\-yi  to  a 
symbol  is  a  quantization  problem.  The  pairs  in  the  Ricker  wavelets  are 
Gaussian  distributed.  Optimal  quantization  of  8  levels  for  the  Gaussian 
samples  is  used  in  this  study. 

(4)  In  the  simulated  and  real  seismograms,  tree  classification  tech¬ 
niques  are  used  in  the  testing  traces  to  extract  the  candidate  bright 
spot  patterns. 

(5)  The  roles  of  likelihood  ratio  test,  optimal  quantization  encoding, 
and  probability  of  detecting  the  signal  involving  in  the  global,  local 
detection,  and  threshold  setting  are  quite  important. 

(6)  Minimum  distance  and  nearest-neighbor  classification  rules  are 
used  in  the  classification  of  Ricker  wavelets.  In  the  detection  of  candi¬ 
date  bright  spot,  the  results  using  threshold  setting  of  string  distance 
to  detect  the  candidate  bright  spot  are  quite  satisfactory  and  can  be 
used  to  improve  seismic  interpretation. 
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CHAPTER  VI 

SYNTACTIC  PATTERN  RECOGNITION  FOR  THE 
RECOGNITION  OF  BRIGHT  SPOT 

6.1  Introduction 

In  Chapter  1,  bright  spot  is  defined  as  the  pattern  having  at  least 
one  of  the  three  physical  properties  and  a  continuous  reflection  layer  in 
a  two-dimensional  seismogram.  In  Chapter  3,  4,  and  5,  candidate  bright 
spots  are  detected  by  using  decision-theoretic  and  syntactic 
approaches.  The  detected  bright  spot  patterns  often  scatter  in  a  two- 
dimensional  seismogram.  The  detected  patterns  should  be  continuous 
and  represent  the  reflection  layer.  Singular  patterns  existing  in  the 
seismogram  do  not  form  a  reflection  layer.  In  order  to  remove  the 
singular  patterns  and  keep  the  continuous  patterns,  use  of  spatial  rela¬ 
tions  in  two-dimensional  seismogram  analysis  is  proposed. One  possibil¬ 
ity  is  to  use  a  string-to-string  correlation  algorithm  for  image  skeletoni¬ 
zation  to  find  the  reflectors  from  one  trace  to  the  next  [lu82a].  Another 
is  to  use  the  spatial  relations  in  the  analysis  of  seismogram  to  keep  the 
continuous  reflection  coefficient  and  eliminate  the  noise. 

In  this  study,  five  methods  are  proposed  to  test  the  continuity  of 
the  reflection  layer.  Local  and  global  pattern  tests  are  heuristic  and 
simple,  but  applicable  to  only  very  limited  cases.  In  order  to  test 
whether  a  continuous  reflection  layer  is  a  bright  spot  or  a  non-bright 
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spot,  syntactic  pattern  recognition  is  again  proposed  in  the  two  dimen¬ 
sional  analysis.  In  syntactic  pattern  recognition,  three  kinds  of  string 
distance  computation  are  proposed  to  test  the  continuity  of  a  bright 
spot  pattern. 


6.2  Local  Pattern  Testing 

The  classification  results  from  the  methods  described  in  Chapters 
3,  4,  or  5  are  used  for  further  processing.  The  amplitude  of  the  seismic 
data  from  the  previous  classification  result  is  0.0  for  a  non-bright  spot 
and  -0.2  for  a  candidate  bright  spot.  Data  with  amplitude  -0.2  is  called 
negative  polarity.  Each  detected  pattern  wavelet  has  about  10  point 
samples  (sampling  interval  is  0.004  seconds).  From  Dobrin  [dob76a], 
the  high  amplitude  portion  of  a  bright  spot  occupies  a  limited  horizon¬ 
tal  region.  From  real  seismograms  in  Figs.  1.1,  1.2,  and  1.3,  the 
wavelets  of  a  bright  spot  are  horizontally  continuous,  i.e.,  a  fairly  large 
number  of  traces  with  wavelets  appearing  in  the  horizontal  direction. 
So  a  horizontal  line  pattern  is  considered  here.  The  procedures  are  dis¬ 
cussed  in  the  following. 

At  first,  consider  five  points  along  the  horizontal  direction.  The 
center  point  is  a  testing  point.  A  testing  point  and  its  four  horizontal 
neighboring  points,  two  on  each  side,  are  considered  as  a  horizontal  line 
pattern  and  shown  in  the  following: 

0  0X00 

where  x  is  the  testing  point.  Because  only  five  points  are  considered 
here,  we  call  the  testing  pattern  a  local  testing  pattern.  The  number  of 
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points  in  the  local  testing  pattern  can  be  varied. 

The  local  testing  pattern  is  shifted  in  the  seismogram  from  left  to 
right  and  from  top  to  bottom.  If  three  of  the  five  horizontal  points  are 
with  negative  polarity,  then  the  five  points  form  a  reflection  pattern. 
Then,  two  methods  are  considered  below. 

(a)  All  five  points  of  the  testing  pattern  maintain  their  original  ampli¬ 
tudes. 

(b)  The  testing  point  x  is  assigned  the  amplitude  of  negative  polarity 
-0.2  and  the  original  amplitudes  of  the  four  neighboring  points  remain 
unchanged. 

Otherwise,  assign  zeros  to  the  testing  pattern  as  the  output  and  test 
the  next  pattern. 

Here  (b)  is  adopted  because  a  non-bright  spot  will  be  changed  to  a 
bright  spot  with  the  help  of  its  neighboring  bright  spots. 

Algorithm  6.1:  Local  pattern  testing 

Input: 

(1)  An  array  A(i,j),i  =  l,  •  •  ■  tM ,  j  =  \,  •  •  •  ,N  of  a  2-dimensional 
seismogram  with  N  traces  and  M  points  in  each  trace. 

(2)  The  amplitudes  of  A  (i  ,j )  are  c  =-0.2  for  a  candidate  bright  spot 
and  0.0  for  a  non-bright  spot,  where  i  is  the  index  along  the  trace,  the 
vertical  coordinate,  and  j  is  the  index  along  the  top  of  each  trace  (sta¬ 
tion),  the  horizontal  coordinate. 

(3)  Testing  pattern  length  L  and  threshold 

t  =rounding  off  - )  . 
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Output:  = 


,N ,  the  local  reflection  patterns. 


Method: 

M  Set  the  output  amplitudes  B(i,j)=  0.0, 
'  •  •  ,M,j - 1,2,  •  •  •  ,N.  Select  a  horizontal  testing  pattern  of  L 
points.  Set  the  location  of  the  center  of  the  horizontal  testing  pattern 

at  an  integer  of  ^  .  The  center  point  is  the  testing  point.  At  first, 

the  pattern  is  at  the  top  and  left  of  the  seismogram,  i.e.,  i  =  l  j  =  i. 

(2)  Let  the  number  of  negative  polarities  in  the  horizontal  testing 
pattern  of  L  points  be  tt . 

(3)  If  the  number  tt  is  less  than  the  threshold  t,  then  set  the  test¬ 

ing  point  to  zero.  Otherwise,  if  tt  is  larger  than  or  equal  to  the  thres¬ 
hold  t,  then  the  output  amplitude  of  the  testing  point  is  set 

equal  to  c=- 0.2  and  the  original  amplitudes  of  the  (L-l)  neighboring 
points  remain  unchanged  and  are  assigned  to  the  output  array  B(i,j). 

(4)  Shift  to  the  next  right  pattern  position,  i.e.,  shift  the  testing 
point  to  the  next  position.  If  the  pattern  reaches  the  right-most  edge 
of  the  seismogram,  j=N ,  then  the  pattern  is  shifted  to  the  beginning  of 
the  next  horizontal  line.  Then  go  to  Step  (2)  and  continue  until  the 

lower  right  corner  of  the  2-dimensional  seismogram  is  reached,  i.e., 

i=M,j=N. 

In  Algorithm  6.1,  the  number  L,  length  of  the  testing  pattern,  is 
determined  by  the  spatial  relation  of  seismic  traces,  i.e.,  the  distance 

between  two  traces.  The  amplitude  of  t,  threshold,  is  determined  from 
L. 

In  this  study,  the  syntactic  classification  results  in  Chapter  5  are 
used  as  the  input.  Using  Algorithm  6.1,  the  length  of  horizontal  testing 


185 


pattern  is  L— 5  and  the  threshold  it  is  3  which  are  selected  for  the 
analysis  of  three  seismograms  here.  Three  results  are  shown. 

(1)  Simulated  seismogram 

From  the  classification  result  in  the  simulated  seismogram  shown 
in  Fig.  6.1  (Fig.  5.10E),  there  are  two  missing  traces.  Using  this  local 
line  pattern  testing,  the  result  is  shown  in  Fig.  6.2.  The  missing  parts  of 
bright  spot,  the  27th  and  the  3Qth  traces,  are  filled  with  the  help  of  the 
neighboring  points  of  bright  spot. 

(2)  High  Island 

From  the  classification  result  (Fig.  6.3,  Fig.  5.14C),  there  is  one 
missing  trace  at  the  lower  left  corner  of  the  seismogram  and  there  are 
three  singular  patterns  in  the  middle  of  the  seismogram.  Using  the 
local  line  pattern  testing,  the  result  is  shown  in  Fig.  6.4.  The  missing 
part  of  bright  spot,  the  54th  trace,  is  filled  with  the  help  of  the  neigh¬ 
boring  points  of  bright  spot.  The  singular  patterns  are  removed. 

(3)  Mississippi  Canyon 

From  the  classification  result  (Fig.  6.5,  Fig.  5.12C),  there  are  many 
missing  traces  and  many  singular  patterns  in  the  seismogram.  Using 
the  local  line  pattern  testing,  the  result  is  shown  in  Fig.  6.6.  The  miss¬ 
ing  parts  of  bright  spot  are  filled  with  the  help  of  the  neighboring  points 
of  bright  spot.  The  singular  patterns  are  removed. 

Although  the  result  in  Fig.  6.6  is  good,  many  missing  traces  exist 
between  four  groups.  The  locations  of  the  four  groups  of  local  pattern 
are  the  7th  to  the  15th.  traces,  the  20 th  to  the  22th  traces,  the  28 th  to 
the  44th  traces,  and  the  48 th  to  the  51£/j.  traces  in  Fig.  6.G.  The  local 
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Figure  6.1  1-D  Classification  result  of  the  simulated  seismogram 
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Figure  6.2  2-D  Local  pattern  testing  result  of  the  simulated  seismogram 
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Figure  6.3  1-D  Classification  result  at  High  Island 
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Figure  6.4  2-D  Local  pattern  testing  result  at  High  Island 
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Figure  6.5  1-D  Classification  result  at  Mississippi  Canyon 
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Figure  6.6  2-D  Local  pattern  testing  result  at  Mississippi  Canyon 
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pattern  testing  can  not  be  applied  to  test  the  continuity  among  the  four 
group  patterns.  A  global  pattern  testing  is  proposed  in  the  following  to 
test  whether  the  local  patterns  form  a  continuous  reflection  layer. 


6.3  Global  Pattern  Testing 

After  the  local  pattern  testing,  singular  patterns  are  usually 
removed.  Only  local  patterns  exist  in  the  seismogram.  The  local  pat¬ 
terns  may  have  some  spatial  relation  between  them,  i.e.,  they  may  be 
on  the  same  reflection  layer  or  on  different  reflection  layers.  For  exam¬ 
ple  in  Fig.  6.6,  many  missing  traces  exist  among  the  four  groups.  The 
purpose  of  the  global  pattern  testing  is  to  group  the  patterns  if  they 
belong  to  the  same  reflection  layer.  The  procedure  is  discussed  in  the 
following. 

Consider  a  horizontal  line  pattern.  Six  horizontal  points  are  tested. 
The  number  of  points  in  the  global  pattern  testing  can  be  varied.  For 
this  case  we  have 

o  x  x  x  x  o 

where  a:  is  a  testing  point.  There  are  4  testing  points  in  the  pattern. 

At  first,  the  amplitudes  of  the  input  seismogram  from  the  previous 
local  pattern  testing  are  assigned  to  the  output.  The  line  pattern  is 
shifted  from  top  and  left  to  bottom  and  right  in  the  seismogram  to  test 
the  four  testing  points.  If  the  amplitudes  of  the  two  end  points  from 
these  six  points  are  -0.2,  then  the  amplitudes  of  the  four  middle  testing 
points  are  assigned  a  negative  polarity  c  =-0.2  as  the  output. 
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According  to  the  discussion  above,  Algorithm  6.2  is  presented  as 
follows. 

Algorithm  6.2:  Test  of  the  continuity  of  local  patterns 


Input:  1,  •  •  •  ,M ,  j- 1,  •••,#,  with  the  amplitudes  of  0  and 

c=~°-2  from  Algorithm  6.1,  and  a  constant  d.  The  length  of  the  hor¬ 
izontal  line  pattern  is  d  +  2. 

Output:  C(i,j),i  =  1,  •  •  •  =  •  •  ■  ,N ,  the  continuous  reflection 

layer. 

Method: 

(1)  Assign  the  amplitudes  of  B(i,j)  to 

C{i,j),i- 1,  •  •  •  ,M,j- 1,  •  •  •  ,N ,  i.e.,  the  original  amplitudes  remain 
unchanged. 

(2)  The  horizontal  testing  pattern  with  length  d+2  starts  at  the  top 
and  left  of  the  input  seismogram,  i.e.,  i  — 1,7  =  1. 

(3)  If  the  amplitudes  of  the  two  end  points  of  the  testing  pattern 
are  -0.2,  then  the  amplitudes  of  the  d  testing  points  within  the  two  end 
points  are  assigned  the  amplitude  c  =  —0.2. 

(4)  Shift  to  the  next  one  right  position  and  go  to  Step  (3),  until  the 

lower  right  corner  of  the  2-dimensional  seismogram  is  reached,  i.e., 
i=M,j  =N. 


In  Algorithm  6.2,  the  length  of  the  testing  pattern  (d+2)  depends 
on  the  spatial  relation  of  seismic  traces  and  can  be  varied.  In  this 
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experiment,  d  equals  4.  The  input  seismogram  is  the  result  from  local 
pattern  testing  of  Fig.  6.6  at  Mississippi  Canyon.  The  result  of  process¬ 
ing  Fig.  6.6  using  Algorithm  6.2  is  shown  in  Fig.  6.7.  Fig.  6.7  is  a  continu¬ 
ous  reflection  layer. 


6.4  Syntactic  Pattern  Recognition  for  the  Recognition 

of  Bright  Spot 

In  Section  6.2,  the  local  pattern  testing  is  useful  only  for  very  lim¬ 
ited  cases,  i.e.,  for  horizontal  line  patterns.  So  syntactic  pattern  recog¬ 
nition  of  bright  spot  is  proposed  to  handle  other  situations.  A  block 
diagram  of  syntactic  pattern  recognition  for  the  recognition  of  bright 
spot  is  shown  in  Fig.  6.8.  Each  candidate  bright  spot  is  an  input  seismo¬ 
gram  in  this  section. 


6.4.1  Preprocessing 

(1)  For  an  input  seismogram,  the  midpoint  of  the  duration  of  the 
candidate  bright  spot  is  extracted.  The  amplitude  of  the  midpoint  is 
c=—0.2  as  the  negative  polarity.  The  output  seismogram  is  called 
“peak  seismogram". 

(2)  Extract  the  pattern  from  the  peak  seismogram.  The  technique 
described  in  [lu82a]  may  be  used  to  link  the  reflectors  from  one  trace 
to  the  next.  Visual  inspection  is  adopted  in  the  experiment  here.  For 
the  extracted  pattern,  the  number  of  peaks  in  each  group  of  L- 5 
traces  are  larger  than  or  equal  to  t—  3.  The  number  L  of  selected 
traces  depends  on  the  spatial  relation  of  seismic  traces  and  can  be 
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Figure 
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6.7  2-D  Local  and  global  pattern  testing  result  at  Mississippi 
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Figure  6.8  A  block  diagram  of  syntactic  pattern  recognition  system 

(For  the  recognition  of  bright  spot  in  2-D  seismogram  of  candidate  bright 

spot) 


197 


varied.  The  threshold  t  is  related  to  the  number  of  selected  traces  as 
m  the  local  pattern  testing  described  in  Section  6.2.  The  points 
between  two  peaks  are  interpolated  if  the  pattern  is  extracted  and  are 
called  interpolated-peaks".  The  vertical  distance  between  two  succes¬ 
sive  peaks  is  less  than  or  equal  to  7  intervals  (sampling  interval  is  0.004 
seconds).  The  reason  for  the  selection  of  7  intervals  of  vertical 
difference  will  be  discussed  later.  Following  the  above  procedure,  the 
singular  peak  is  eliminated.  Edge  detection  technique  in  image  pro¬ 
cessing  can  not  handle  this  kind  of  problem  because  two  successive 
peaks  may  not  be  the  neighbors. 

Consider  the  thickness  of  a  reflection  layer  and  the  time  shift 
between  two  successive  peaks.  The  duration  of  7  intervals  is  7x0.004 
seconds  =  0.028  seconds  for  the  two-way  vertical  travel  time  of  P-wave. 
So  one-way  travel  time  is  0.014  seconds.  Suppose  that  from  Fig.  3.5A, 
the  P-wave  velocity  at  gas  sand  zone  is  1.737  Km/sec,  then  the  thick¬ 
ness  of  the  layer  corresponding  to  0.014  seconds  is  1.737  Km/sec 
xO. 014=24.318  meters.  If  the  distance  between  two  adjacent  traces  is 
2o  meters,  then  the  dipping  angle  is  approximately  45  degrees  which 
indicates  a  high  dipping  layer.  If  the  vertical  difference  of  two  succes¬ 
sive  peaks  is  larger  than  7  intervals,  the  pattern  is  discontinuous  and  is 
not  considered  here. 

(3)  The  difference  di=yi+1-yi  0f  vertical  coordinates  of  two  adja¬ 
cent  peaks,  (xi+1,yi+1)  and  (xi,yi),  is  assigned  a  symbol  cj*.  The  primi¬ 
tives  are  as  follows. 

Ui  =c  for  di  =  6  or  7. 

Oi  =  b  for  dj  =  4  or  5. 

Oi=n  for  d:  =  2  or  3. 
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Wi=o  for  ^  =  1,0,  or  -1. 
u>i=A  for  =  -2  or  -3. 

Ui-B  for  d*  =  -4  or  -5. 

Ui  =  C  for  di  =  -6  or  -7. 

From  the  peak  to  peak  or  the  peak  to  interpolated-peak,  the  pat¬ 
tern  is  encoded.  From  the  interpolated-peak  to  peak  or  interpolated- 
peak,  the  pattern  is  encoded  as  cj*=x. 


6.4.2  Generation  of  Training  Patterns  of  Bright  Spot 

In  Figure  1.1,  1.2,  and  1.3,  the  shape  of  a  bright  spot  pattern  may 
be  arched,  horizontal,  or  concave.  The  flat  or  concave  shape  of  bright 
spots  may  be  caused  by  the  velocity  variation  above  the  bright  spots. 
The  mathematical  equation  of  geological  structure  to  generate  the  19 
seismic  patterns  are  given  in  the  following.  The  P-wave  velocity  at  the 
gas  sand  zone  is  1.737  Km/sec.  Above  the  gas  sand  zone,  a  shale  layer 
is  assumed,  the  P-wave  velocity  is  2.59  Km/sec  which  is  the  same  as 
that  in  Fig.  3.5A. 

(l)  For  flat  and  arched  patterns 
/(i)=C(sm~^-)2-2.0  £  =  180—332 

where  C  is  0.0,  0.2,  0.3,  0.4,  0.5,  0.6,  0.7,  0.8,  0.9,  and  1.0  for  the  flat  and 
arched  patterns  respectively. 


(2)  For  concave  patterns 


199 


/(i)=c(sm^^-)2-1.0  i=  180-332 

where  C  is  -0.2,  -0.3,  -0.4,  -0.5,  -0.6,  -0.7,  -0.8,  -0.9,  and  -1.0  for  the  con¬ 
cave  patterns  respectively. 

The  training  patterns  with  different  shapes  of  bright  spot  are  simu¬ 
lated.  The  19  simulated  seismograms  of  reflection  coefficient  are  gen¬ 
erated.  Only  reflection  coefficients  are  shown  in  the  seismograms,  so 
the  seismograms  are  the  peak  seismograms.  The  shape  of  bright  spot 
can  be  changed  by  varying  the  shape  at  the  top  boundary  of  the  gas 
sand  zone.  Fig.  6.9  shows  the  flat  shape.  Fig.  6.10  -  6.10  show  the 
arched  shapes.  Fig.  6.19  -  6.27  show  the  concave  shapes.  Fig.  6.18  and 
6.27  have  7  intervals  for  the  vertical  difference  of  two  successive  peaks 
which  is  the  maximum  encoded  distance  in  this  study.  There  are  20 
traces  in  each  seismogram  of  a  bright  spot  pattern,  Training  strings  of 
19  symbols  are  encoded  for  each  seismogram.  The  encoded  string  of 
each  bright  spot  pattern  is  given  in  the  caption  of  each  figure.  Because 
the  candidate  bright  spot  of  Fig.  6.1  has  20  traces,  so  peak  seismogram 
of  20  traces  is  simulated  here.  In  this  study,  the  20  traces  of  bright 
spot  pattern  and  19  simulated  seismograms  can  handle  those  of  which 
the  number  of  traces  is  less  than  or  equal  to  20.  If  the  number  of 
traces  of  a  candidate  bright  spot  is  larger  than  20,  then  simulated 
seismograms  with  more  than  20  traces  can  also  be  generated. 

6.4.3  String  Distance  Computation 

Three  kinds  of  string  distance  computation  are  discussed  in  the  fol¬ 
lowing. 
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Figure  6.9  Simulated  peak  seismogram  of  bright  spot 
for  flat  shape,  training  string  o19 
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Figure  6. 10  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  oao15^lo 
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Figure  6.11  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  a4o11A* 
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Figure  6.12  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  a*oao7AoA 4 
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Figure  6.13  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  ba5o'lA5B 
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Figure  6. 14  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  bzasosAsBz 
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Figure  6.15  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  b3abazo5AzBAB 3 
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Figure  6. 16  Simulated  peak  seismogram  of  bright  so 
or  arched  shape,  training  string  cb*aWA:sB,c 
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Figure  6. 17  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string  c3&3aso3>l2£3c3 


Figure  6. 18  Simulated  peak  seismogram  of  bright  spot 
for  arched  shape,  training  string,  c3bza3o3A3BzC3 
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Figure  6.19  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  oAol5ao 
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Figure  6.20  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  i44ona,4 
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Figure  6.21  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  i45o9a5 
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6.22  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  BA5o7a5b 
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Figure  6.23  Simulated  peak  seismogram  of  bright  spot 

for  concave  shape,  training  string  B2ABA3o5a3bab 2 
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Figure  6.24  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  i?443o5a3b4 
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Figure  6.25  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  CB3A4o3a4b3c 
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Figure  6.26  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  C3B3Azo3a2b3c3 


218 


0 .OSec- 


10 


20 


l.OSec- 


2.0Sec 


Figure  6.27  Simulated  peak  seismogram  of  bright  spot 
for  concave  shape,  training  string  C3BzA3o3azb3c3 
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(I)  Shifted  string  matching  method 

Levenshtein  distances  between  the  extracted  string  of  candidate 
bright  spot  and  the  training  strings  of  bright  spot  are  calculated.  Sup¬ 
pose  that  the  length  of  a  training  string  of  bright  spot  is  N  and  the 
length  of  an  extracted  string  is  M.  N  is  larger  than  or  equal  to  M.  Ini¬ 
tially,  Levenshtein  distance  is  computed  between  the  extracted  string 
of  length  M  and  the  beginning  M  symbols  of  a  training  string.  The  dis¬ 
tance  is  dj.  Because  the  lengths  of  the  two  strings  are  equal,  only  sub¬ 
stitution  errors  need  to  be  considered.  It  is  the  disadvantage  that  the 
deletion  and  insertion  errors  are  not  considered  in  the  string  distance 
computation.  Then,  Levenshtein  distance  is  computed  between  the 
extracted  string  of  length  M  and  the  next  M  symbols  of  the  training 
string.  The  distance  is  dz.  The  shifted  interval  is  1  symbol.  So  there 
are  N—M+l  distances  for  the  extracted  string  moving  across  one  train¬ 
ing  string.  Then  Levenshtein  distances  are  computed  between  the 
extracted  string  with  the  next  training  string  until  Levenshtein  dis¬ 
tances  between  the  extracted  string  and  all  the  training  strings  are 
computed.  Finally,  comparing  the  calculated  Levenshtein  distances, 
and  the  minimum  is  selected  as  the  distance  between  the  training 
strings  and  the  extracted  string. 


(II)  Modified  distance  computation 

In  [fu79a],  the  normalized  distance  is  defined  as  follows. 


d(x,y) 

max[\x\,\ y  |] 


The  purpose  is  to  take  the  variation  of  pattern  size  into  consideration. 

Here  a  modified  distance  is  defined  as 
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,y)-d(x  ,y)-  \  \x\-\y  |  | 


where  |  x  |  denotes  the  length  or  the  number  of  terminal  symbols  of  x . 
The  reason  for  this  definition  is  to  take  the  difference  of  the  lengths  of 
a:  and  y  into  consideration.  In  this  study,  y  may  be  a  substring  of  x 
and  the  length  of  y  is  shorter  than  the  length  of  x.  Deletion  error  due 
to  the  difference  in  lengths  of  x  and  y  is  considered.  Levenshtein  dis¬ 
tance  is  computed  between  the  extracted  string  and  each  training 
string.  The  minimum  is  determined  from  all  the  computed  Levenshtein 

distances  as  the  distance  between  the  training  strings  and  the 
extracted  string. 

(Ill)  Substring  matching  computation 

In  [oom82a],  Oommen  developes  an  algorithm  which  computes  the 
distance  between  a  string  and  a  substring.  The  string  x  is  divided  into 
all  possible  individual  elements  of  a  certain  subset  of  the  contiguous 
substrings,  then  Levenshtein  distances  between  all  possible  substrings 
of  x  and  substring  y  are  computed.  The  minimum  of  the  Levenshtein 

distances  is  determined  as  the  distance  between  training  string  x  and 
substring  y . 

Oommen  s  technique  can  be  expanded  to  match  a  substring  with 
several  strings.  Here,  the  strings  are  the  training  strings  of  bright  spot 
patterns  and  the  substring  is  the  extracted  string  of  a  candidate  bright 
spot.  The  minimum  of  the  Levenshtein  distances  is  determined  as  the 
distance  between  all  the  training  strings  and  the  substring. 


221 


6.4.4  Thresholding  for  the  Recognition  of  Bright  Spot  String 

From  Section  6.4.3,  the  minimum  distance  between  a  set  of  train¬ 
ing  strings  and  an  extracted  string  can  be  calculated.  Then,  set  a 

threshold  t=rounding  off  This  threshold  can  be  varied  due  to  the 

classification  results  from  the  previous  chapters.  If  the  minimum  dis¬ 
tance  is  less  than  or  equal  to  this  threshold,  then  the  extracted  string 

is  classified  as  a  bright  spot.  Otherwise,  the  extracted  string  is  a  non- 
bright  spot. 

Input  the  next  extracted  string  and  compute  the  string  distance 
described  in  Section  6.4.3,  until  all  the  extracted  strings  are  tested. 

Three  flow  charts  are  shown  in  Fig.  6.28  -  Fig.  6.30.  Fig.  6.28  uses 
the  method  (I),  moving  technique.  Fig.  6.29  uses  the  method  (II), 
revised  distance  computation.  Fig.  6.30  uses  the  method  (III),  strings 
and  substring  matching  computation. 


6.5  Experimental  Results 

Three  seismograms  of  candidate  bright  spot,  Fig.  6.1,  6.3,  and  6.5, 
are  processed.  In  the  real  seismogram  at  Mississippi  Canyon  and  High 
Island,  the  distance  between  two  adjacent  traces  is  25.54  meters,  i.e., 
64  traces  for  1  mile  (1.609  Km). 

(l)  Experiment  using  simulated  seismogram 

From  Fig.  6.1,  using  the  midpoint  of  the  duration  of  candidate 
bright  spot,  the  peak  seismogram  is  shown  in  Fig.  6.31.  From  Fig.  6.32, 
the  extracted  string  is  aaaaxaoooooooMxAM.  The  19  training  strings 
from  Section  6.4.2  are  used.  The  computations  follow  the  flow  charts  of 
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Figure  6.28  Flow  chart  of  method  (I) 
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Figure  6.29  Flow  chart  of  method  (II) 
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Figure  6.30  Flow  chart  of  method  (III) 
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Figure  6.31  Peak  seismogram  of  Figure  6.1 
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Figure  6.32  Extracted  string,  a*xao7AzxA 3,  from  simulated  seismogram 
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Fig.  6.28,  6.29,  and  6.30.  Using  method  (I)  (Fig.  6.28)  the  lengths  of  the 
extracted  string  and  the  19  training  strings  are  equal,  there  is  no  mov¬ 
ing  for  the  extracted  string.  Levenshtein  distance  is  computed  between 
each  training  string  and  the  extracted  string.  The  minimum  is  selected 
from  the  19  Levenshtein  distances  and  is  the  distance  between  the 
training  strings  and  the  extracted  string.  The  minimum  distance  is  3. 
Using  method  (II)  (Fig.  6.29)  the  minimum  distance  is  3.  Using  method 
(III)  (Fig.  6.30)  the  minimum  distance  is  2.  Threshold  is  set  as 

t  =ro unding  off  -^—=6.  Distance  2  or  3  is  less  than  the  threshold  6. 

So  the  extracted  string  is  recognized  as  a  bright  spot.  Comparing  the 
results  from  the  three  methods,  the  minimum  distance  2  using  method 
(III)  is  the  shortest. 


(II)  Experiment  using  the  data  from  Mississippi  Canyon 

From  Fig.  6.5,  using  the  midpoint  of  the  duration  of  candidate 
bright  spot,  the  peak  seismogram  is  shown  in  Fig.  6.33.  From  Fig.  6.34, 
four  extracted  strings  are  oxo  ,  ooxoxxoooooxoooo  ,  ooo  ,  ooooooxo  . 

For  the  first  string  oxo,  using  method  (I),  the  minimum  distance  is 
1.  Using  method  (II),  the  minimum  distance  is  1.  Using  method  (III), 

the  minimum  distance  is  1.  Threshold  is  set  as  t  grounding  off  — =1. 

3 

So  the  first  string  is  recognized  as  a  bright  spot. 

For  the  second  string  ooxoxxoooooxoooo ,  using  method  (I),  the 
minimum  distance  is  4.  Using  method  (II),  the  minimum  distance  is  4. 

Using  method  (III),  the  minimum  distance  is  4.  Threshold  is  set  as 
1 6 

=rounding  off  “^—=5.  So  the  second  string  is  recognized  as  a  bright 

spot. 
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Figure  6.33  Peak  seismogram  of  Figure  6.5 
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from  the  data  at  Mississippi  Canyon 
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Figure  6.34  Continued 
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For  the  third  string  ooo ,  using  method  (I),  the  minimum  distance  is 
0.  Using  method  (II),  the  minimum  distance  is  0.  Using  method  (III), 

the  minimum  distance  is  0.  Threshold  is  set  as  t  grounding  off  — 1 . 

3 

So  the  third  string  is  recognized  as  a  bright  spot. 

For  the  fourth  string  ooooooxo ,  using  method  (I),  the  minimum  dis¬ 
tance  is  1.  Using  method  (II),  the  minimum  distance  is  1.  Using 
method  (III),  the  minimum  distance  is  1.  Threshold  is  set  as 

t— rounding  off  — =3.  So  the  fourth  string  is  recognized  as  a  bright 

vJ 

spot. 

The  four  extracted  strings  at  Mississippi  Canyon  are  all  recognized 
as  bright  spots. 


(Ill)  Experiment  using  the  data  from  High  Island 

From  Fig.  6.3,  using  the  midpoint  of  the  duration  of  candidate 
bright  spot,  the  peak  seismogram  is  shown  in  Fig.  6.35.  From  Fig.  6.36, 
the  extracted  string  is  00200000000.  Using  method  (I),  the  minimum 
distance  is  1.  Using  method  (II),  the  minimum  distance  is  1.  Using 
method  (III),  the  minimum  distance  is  1.  Threshold  is  set  as 

t  grounding  off  So  the  extracted  string  is  recognized  as  a 

u 

bright  spot. 
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Figure  6.35  Peak  seismogram  of  Figure  6.3 
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Figure  6.36  Extracted  string,  o2zo0,  from  the  data  at  High  Island 
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6.6  Discussions 

(1)  The  length  and  the  threshold  in  the  local  and  global  testing  pat¬ 
terns  can  be  varied.  The  distance  between  two  adjacent  traces  is  con¬ 
sidered  in  the  determination  of  the  length  and  the  threshold  of  the  pat¬ 
tern.  Interpolation  techniques  may  be  used  in  the  interpolation  of  the 
traces.  The  threshold  may  be  changed. 

(2)  Each  detected  candidate  bright  spot  pattern  has  about  10  sam¬ 
ples.  From  the  real  seismogram  in  this  study,  the  wavelets  of  a  bright 
spot  are  horizontally  continuous.  So  horizontal  line  pattern  is  used  for 
local  and  global  pattern  tests.  But  the  method  is  only  useful  for  very 
limited  cases. 

(3)  Syntactic  pattern  recognition  technique  can  be  used  to  recog¬ 
nize  the  reflector  of  a  bright  spot. 

(4)  The  difference  of  the  vertical  coordinates  of  two  successive 
peaks  is  not  larger  than  7  intervals  (sampling  interval  is  0.004  seconds) 
in  this  study. 

(5)  The  20  traces  of  a  bright  spot  pattern  and  the  19  simulated 
seismograms  generated  as  the  training  patterns  of  bright  spot  can  be 
used  to  handle  the  bright  spot  patterns  of  which  the  number  of  traces 
is  less  than  or  equal  to  20.  If  the  number  of  traces  of  a  candidate 
bright  spot  is  larger  than  20,  then  simulated  seismograms  with  more 
than  20  traces  need  to  be  generated. 

(6)  From  the  experimental  results,  the  extracted  strings  from  the 

real  data  at  Mississippi  Canyon  and  High  Island  are  recognized  as  bright 
spots. 


236 


CHAPTER  VII 

SUMMARY  AND  RECOMMENDATIONS 

7. 1  Summary 

We  have  studied  the  application  of  decision-theoretic  and  syntactic 
pattern  recognition  techniques  to  the  detection  of  bright  spot.  The 
difficulty  of  applying  pattern  recognition  techniques  in  this  study  lies  in 
the  feature  extraction.  Analytic  signal  analysis  can  extract  the  features 
from  the  original  signal.  In  decision-theoretic  pattern  recognition,  our 
studies  concentrate  on  the  extraction  of  physical  properties.  In  syntac¬ 
tic  pattern  recognition,  our  studies  concentrate  on  the  structural  infor¬ 
mation  of  waveform  pattern. 

We  may  summarize  this  study  as  follows: 

(1)  Decision-theoretic  and  syntactic  pattern  recognition  techniques 
are  used  for  the  classification  of  Ricker  wavelets  and  the  detection  of 
candidate  bright  spot. 

(2)  Analytic  signal  analysis  can  be  used  to  extract  envelope,  instan¬ 
taneous  frequency,  and  polarity  as  the  features. 

(3)  From  the  analysis  of  zero-phase  Ricker  wavelets,  tree 
classification  is  adopted.  Using  the  structural  information  of  string 
wavelets,  syntactic  pattern  recognition  for  bright  spot  detection  is 


established. 
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(4)  Three  hypotheses  are  proposed  as  the  constrained  conditions  to 
detect  the  candidate  bright  spot  in  the  tree  classification. 

(5)  In  syntactic  pattern  recognition,  the  roles  of  likelihood  ratio 
test,  optimal  quantization  encoding,  and  the  probability  of  detecting 
signal  involving  in  the  global  and  local  detection,  and  the  threshold  set¬ 
ting  to  detect  the  candidate  bright  spot  are  quite  important. 

(6)  Spatial  relation  in  a  two  dimensional  seismogram  can  be  used  to 
test  the  continuity  of  reflection  layer  and  remove  the  singular  patterns. 

(7)  Syntactic  pattern  recognition  technique  can  be  used  to  recog¬ 
nize  the  string  representation  of  a  bright  spot  pattern  in  a  two  dimen¬ 
sional  seismogram.  Three  kinds  of  string  distance  computation  are 
proposed. 

(8)  In  simulation  and  real  data  experiments,  the  classification 
results  can  be  used  to  improve  seismic  interpretation  and  provide  a 
reference  for  the  prediction  of  gas  accumulation. 

(9)  In  real  applications,  if  physical  properties  of  the  waveform  pat¬ 
tern  are  separable,  (i.e.,  envelope  is  higher  than  0.15  and  the  difference 
of  the  instantaneous  frequencies  of  the  two  class  centers  for  bright  spot 
and  non-bright  spot  is  larger  than  3.0  Hz)  the  result  of  tree 
classification  is  good.  If  the  instantaneous  frequency  is  not  separable, 
then  the  result  from  syntactic  pattern  recognition  is  better. 

It  is  known  that  the  bright  spot  technique  (predicting  by  the  high 
amplitude  portion  of  the  reflection)  has  had  70  percent  success  [flo76a] 
in  locating  gas  accumulations  in  offshore  Nigeria,  Indonesia,  and  the 
United  States  Gulf  Coast.  It  should  be  interesting  to  use  the  proposed 
tree  classification  techniques  and  syntactic  pattern  recognition  tech¬ 
niques  to  detect  the  bright  spot  and,  hopefully,  to  increase  the  success 
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of  future  utilizations  of  bright  spot  technique. 


7.2  Recommendation 

The  following  problems  are  recommended  for  further  investigation: 

(l)  In  Chapter  3,  zero-phase  and  minimum-phase  wavelets  are 
decided  by  geophysicist,  an  automatic  method  to  determine  the  dom¬ 
inant  wavelet  is  necessary.  Waveform  shaping  filter  can  be  used  to 
change  the  wavelet  to  zero-phase  Ricker  wavelet,  then  a  tree  classifier 
is  easy  to  design. 

1,2)  In  Chapter  6,  an  automatic  method  to  link  peaks  as  a  string 
should  be  developed. 

(3)  The  application  of  pattern  recognition  techniques  to  the  recog¬ 
nition  of  other  patterns  in  the  seismogram,  for  example,  flat  spot  pat¬ 
tern,  pinchout  pattern,  •  •  •  ,  etc  [dob?6a,  pan70a,  pay77a,  she74a, 
she75a,  she76a]  sholud  be  an  interesting  research  project. 

(4)  Analytic  signal  analysis  has  produced  a  certain  number  of 
features  useful  in  seismic  pattern  recognition;  by  using  image  process¬ 
ing  techniques,  new  features  may  be  extracted  that  could  improve  the 
pattern  recognition  results. 

(5)  The  synthetic  seismogram  may  be  generated  by  using  sophisti¬ 
cated  methods.  When  diffraction,  reflection  and  refraction  patterns  are 
included  in  the  synthetic  seismogram,  the  application  of  pattern  recog¬ 
nition  techniques  to  seismic  analysis  should  be  a  challenging  work. 

(6)  The  two  dimensional  pattern  of  gas  reflection  is  aflected  by  the 
velocity  above  the  gas  sand  zone.  Velocity  effect  may  be  considered  in 
the  syntactic  approach. 
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